NAVAL RESEARCH
LOGISTICS
QUARTERLY






MARCH 1977 
VOL. 24, NO. 1 




OFFICE OF NAVAL RESEARCH 

NAVSO P-1278 



NAVAL RESEARCH LOGISTICS QUARTERLY 



EDITORS 



Murray A. Geisler
Logistics Management Institute 



W. H. Marlow 
The George Washington University 



Bruce J. McDonald 
Office of Naval Research 



MANAGING EDITOR 

Seymour M. Selig 
Office of Naval Research 
Arlington, Virginia 22217 



ASSOCIATE EDITORS 



Marvin Denicoff

Office of Naval Research 

Alan J. Hoffman 
IBM Corporation 

Neal D. Glassman 

Office of Naval Research 



Jack Laderman 

Bronx, New York 

Thomas L. Saaty 

University of Pennsylvania 

Henry Solomon 

The George Washington University 



The Naval Research Logistics Quarterly is devoted to the dissemination of scientific information in logistics and 
will publish research and expository papers, including those in certain areas of mathematics, statistics, and economics, 
relevant to the over-all effort to improve the efficiency and effectiveness of logistics operations. 

Information for Contributors is indicated on inside back cover. 

The Naval Research Logistics Quarterly is published by the Office of Naval Research in the months of March, June, 
September, and December and can be purchased from the Superintendent of Documents, U.S. Government Printing 
Office, Washington, D.C. 20402. Subscription Price: $11.15 a year in the U.S. and Canada, $13.95 elsewhere. Cost of
individual issues may be obtained from the Superintendent of Documents. 

The views and opinions expressed in this Journal are those of the authors and not necessarily those of the Office
of Naval Research.

Issuance of this periodical approved in accordance with Department of the Navy Publications and Printing Regulations,
P-35 (Revised 1-74).



A TWO-ECHELON INVENTORY MODEL WITH PURCHASES, DISPOSITIONS, 
SHIPMENTS, RETURNS AND TRANSSHIPMENTS 



Bruce Hoadley and Daniel P. Heyman 

Bell Telephone Laboratories 
Holmdel, New Jersey 



ABSTRACT 

This paper presents a one-period two-echelon inventory model with one 
warehouse in the first echelon and n warehouses in the second echelon. At the 
beginning of the period the stock levels at all facilities are adjusted by purchasing 
or disposing of items at the first echelon, returning or shipping items between the 
echelons and transshipping items within the second echelon. During the period, 
demands (which may be negative) are placed on all warehouses in the second 
echelon and an attempt is made to satisfy shortages either by an expedited shipment 
from the first echelon to the second echelon or an expedited transshipment within 
the second echelon. The decision problem is to choose an initial stock level at the 
first echelon (by a purchase or a disposition) and an initial allocation so as to mini- 
mize the initial stock movement costs during the period plus inventory carrying 
costs and system shortage costs at the end of the period. 

It is shown that the objective function takes on one of four forms, depending 
on the relative magnitudes of the various shipping costs. All four forms of the 
objective function are derived and proven to be convex. 

Several applications of this general model are considered. We also consider 
multi-period extensions of the general model and an important special case is 
solved explicitly. 



1. INTRODUCTION 

In telephone and many military supply systems, when an item is taken out of inventory and 
placed with a customer, the equipment is still owned by the supplying organizations. For example, 
customer station apparatus (e.g., telephone sets) are owned by the operating telephone company 
and aircraft engines or on-board electronic equipment are owned by a branch of 
the military service. These items are used by the customer and then returned for possible repair and 
reuse by another customer. Typically, these items are supplied from a multi-echelon multi-location 
inventory system. 

Issues that must be resolved at various levels in a supply line are how much new material 
should be ordered or how much existing stock should be disposed, and how should the existing 
stock be allocated to the various echelons. This typically complex problem is further complicated 
by the fact that at many locations material returns from a lower echelon can exceed demands,
thus causing a negative net demand. Figure 1 is a schematic diagram of a general supply line where
the boxes represent the stocking locations and the arrows represent material flows. The output
from the factories must be allocated to a few very large warehouses we will call material management
centers. Each material management center supplies a set of service centers or central stocks which
in turn supply the work locations and receive returned material from them. Finally, the material
flows from the work locations to the customers and after use the material flows back to the work
locations.

Figure 1. A general supply line. (Boxes shown: Material Management Centers, Service Centers, Work Locations.)

The kinds of questions that arise in this system are:

(i) How should the factory output be allocated? 

(ii) When and how much should one echelon order from another? 

(iii) Should the material be stocked at the material management center which then would 
supply the service centers on demand, or should some of the material be stocked at the service 
centers? 

(iv) Should the work locations keep stock, or should it be returned to the service centers? 

(v) When is it economical to transship material between locations in the same echelon? 




1.1. Outline and Motivation 

To study these questions, we will consider the two echelon system shown below:

(Diagram: the first echelon and the second echelon.)



The location in the first echelon will be called facility 0 and the locations in the second echelon
will be called facilities 1, 2, . . ., n. Shipments from an outside supplier are received only at facility
0, and external demands and returns occur only at facilities 1, 2, . . ., n. We assume that the net
demands (gross demands minus returns) are independent from location to location and from time
period to time period. We emphasize that net demand at a second echelon facility may be negative.

Regular and expedited shipments are allowed between any pair of facilities. Since there is no
external demand at facility 0, by assuming that it is always less expensive to expedite a shipment
between any two second echelon facilities than to expedite shipments between them via facility
0, we do, in effect, prohibit expedited shipments from the second echelon to the first. Facility
0 may dispose of stock but the other facilities cannot.

In Section 2 we present a general theory for such a system. In particular, we consider the
following decision problems. How much inventory should

a) facility 0 order or dispose of;

b) facility 0 ship to facilities 1, 2, . . ., n;

c) facility 0 recall from facilities 1, 2, . . ., n;

d) be transshipped between second echelon facilities.

We will also obtain the operating characteristics of the expedited shipments and transshipments.

In Section 3 it is shown how the general model can be applied to three applications that
motivated this study. They are the stock control problem in a regional material distribution system, the
returns problem and the transshipment problem in an inventory management system.

Multi-period extensions of the general model are discussed in Section 4 and a multi-period 
formulation of the stock control problem is solved. 




1.2. Relation to Previous Work

Clark [3] is a general survey of multi-echelon systems. Allen [1] considers transshipments among 
several locations (which correspond to our lower echelon) and finds an algorithm to minimize trans- 
shipment plus storage costs. Bessler and Veinott [2] consider a multi-period model of a multi- 
echelon system with regular and expedited redistributions, but they strongly restrict the flows in 
their model by assuming that each location can receive shipments from exactly one other location. 
They obtain bounds on the amount of stock on hand after the regular movements at each location. 
Clark and Scarf [4] consider purchasing policies for a multi-period multi-echelon system. They 
solved the problem when activities are arranged in series. They discuss the problem treated in this 
paper of activities arranged in parallel; they specifically [p. 486] exclude transshipments, claiming 
they are not done in practice; and state that "The theoretical and computational aspects of the 
problem become quite complex." Fukuda [9] extends the (activities in series) model of Clark and 
Scarf by allowing excess stock in each echelon to be disposed of. Gross [8] describes the special case 
of our single-period model where only regular transshipments are allowed and obtains an algorithm 
to find the optimal order quantity at each second echelon facility when there is no a priori effective 
bound on the number of items available. Simpson [12] includes such a bound but does not allow 
transshipments. Das [5] considers a two-location single-echelon single-period model where trans- 
shipments are allowed at a given time epoch in the period; returns are not allowed. He finds condi- 
tions for the cost function to be convex and for no transshipments to be optimal. The single-echelon 
model with expedited transshipments of Krishnan and Rao [10] is a very special case of our model. 

In our opinion, existing multi-echelon inventory theories often fall short of practical usefulness
because they are overly complicated or contain very restrictive assumptions. In this paper we
attempt to develop a reasonably simple model that when used with care can help analyze some of
the questions raised above.

2. GENERAL FORMULATION AND RESULTS 

We assume that events and costs occur in the following manner. At the start of the period, the 
stocks at each facility are observed. The inventory manager is then given the option of adjusting 
the stock levels at all facilities in the system. The adjustment can be accomplished by (i) purchasing 
items at facility 0, (ii) disposing of items at facility 0, (iii) returning items from the second echelon 
to the first echelon, (iv) shipping items from the first echelon to the second echelon, (v) transshipping 
between facilities in the second echelon. A per unit cost is associated with each of these transactions, 
and we assume that the adjustments chosen all occur instantaneously. 

After the new stock levels are set, the demands at each second echelon facility are realized. These 
demands may be negative, indicating that the number of withdrawals requested was smaller than 
the number of items returned. If the stock at a facility is less than the demand, the manager will 
attempt to satisfy the demand by making an expedited shipment from the first echelon or an expe- 
dited transshipment from another facility. A different holding charge for each echelon is incurred 
for items remaining in the system at the end of the period. The same holding charge applies to all 
second echelon facilities. A system shortage cost is incurred for each demand not satisfied by the 
system. This cost is the same for all facilities and is proportional to the number of unsatisfied 
demands. 




2.1. Notations and Definitions 

Let $x_i$ be the stock level at facility $i$ ($i = 0, 1, \ldots, n$) at the beginning of the period (a negative
$x_i$ can be interpreted as a number of backorders), $y_i \ge 0$ be the stock level at facility $i$ after the
initial adjustments, and $D_i$ be the independent (random) demands at facility $i$ ($i \ne 0$). Now define
the quantities

(1)  $\mathrm{IN}_i = (y_i - x_i)^+$

(2)  $\mathrm{OUT}_i = (x_i - y_i)^+$

(3)  $V_i = (D_i - y_i)^+$

for $i = 1, 2, \ldots, n$, where $(a)^+ = \max(a, 0)$. Then at facility $i$, $\mathrm{IN}_i$ is the number of items initially
added to the stock, $\mathrm{OUT}_i$ is the number of items initially subtracted from the stock, and $V_i$ is the
shortage before the expedited adjustments are made.
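As an illustration only, the quantities in (1)-(3) can be computed directly from the three vectors. The sketch below is not part of the original paper; the function names and the three-facility data are hypothetical.

```python
def positive_part(a):
    """(a)^+ = max(a, 0), as used in (1)-(3)."""
    return max(a, 0)


def initial_adjustments(x, y, d):
    """Return (IN, OUT, V) for second-echelon facilities 1..n.

    x: stock before adjustment, y: stock after adjustment,
    d: realized net demand (may be negative when returns exceed demands).
    """
    IN = [positive_part(yi - xi) for xi, yi in zip(x, y)]
    OUT = [positive_part(xi - yi) for xi, yi in zip(x, y)]
    V = [positive_part(di - yi) for di, yi in zip(d, y)]
    return IN, OUT, V


# Hypothetical data for three second-echelon facilities.
x = [5, 2, 8]
y = [4, 6, 8]
d = [7, -1, 8]
IN, OUT, V = initial_adjustments(x, y, d)
print(IN, OUT, V)  # [0, 4, 0] [1, 0, 0] [3, 0, 0]
```

Facility 2 has negative net demand in this example, so it ends the period with more stock than it started with and contributes no shortage.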

Now we consider the various types of stock movements. At the beginning of the period, let

$P$ = total number of purchases at facility 0,

$J$ = total number of dispositions (junks, sales) at facility 0,

$S$ = total number of shipments from facility 0 to the second echelon,

$T$ = total number of transshipments among second echelon facilities,

$R$ = total number of returns from the second echelon to facility 0.

These movements will be called regular movements and the per unit costs of each, for all pairs
of facilities, are $C_P$, $C_J$, $C_S$, $C_T$ and $C_R$, respectively.

For expedited adjustments, let

$S^*$ = total number of expedited shipments from facility 0 to the second echelon,

$T^*$ = total number of expedited transshipments among facilities in the second echelon.

The per unit costs of these movements, for all pairs of facilities, are $C_S^*$ and $C_T^*$, respectively.
Since $S^*$ and $T^*$ depend on the demands, they are random variables.

At the end of the period, after the expedited movements have occurred, let

$V_0$ = system shortage (the excess of total demand over total stock),

$s$ = system shortage cost per unit short,

$I_0$ = inventory at the first echelon,

$h_0$ = carrying cost per unit of $I_0$,

$I$ = inventory at the second echelon,

$h$ = carrying cost per unit of $I$.

Since $V_0$, $I_0$ and $I$ depend on the demands, they are also random variables.

We note that all movement costs depend only on the number of items moved and not on the 
distance between the facilities involved. Also there are no fixed costs in the model. 

Finally, we define the cost of regular movements by

(4)  $C_{RM} = C_P P + C_J J + C_S S + C_T T + C_R R$,

the expected cost of expedited movements by

(5)  $C_{EM} = C_S^* \, E(S^*) + C_T^* \, E(T^*)$,

the expected carrying cost by

(6)  $CC = h_0 \, E(I_0) + h \, E(I)$,

the expected system shortage cost by

(7)  $CSS = s \, E(V_0)$,

and the total expected cost by the sum of the four costs given above. Our problem is to choose
nonnegative values of $y_0, y_1, y_2, \ldots, y_n$ to minimize the total expected costs.

2.2. General Results for Inventory Levels and Costs 

To simplify the formulas, let

$X = \sum_{i=0}^{n} x_i$

and

$Y = \sum_{i=0}^{n} y_i$.

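By using a conservation argument, one easily establishes that

(8)  $I_0 + I = \left( Y - \sum_{i=1}^{n} D_i \right)^+$,

(9)  $V_0 = \left( \sum_{i=1}^{n} D_i - Y \right)^+$.

We can also use conservation arguments to show

(10a)  $\sum_{i=1}^{n} \mathrm{IN}_i = S + T$,

(10b)  $\sum_{i=1}^{n} \mathrm{OUT}_i = R + T$,

and

(10c)  $\sum_{i=1}^{n} y_i = \sum_{i=1}^{n} x_i + S - R$,

or

(11)  $R = \sum_{i=1}^{n} (x_i - y_i) + S$.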

Substituting (10a) and (11) in (4) we obtain

(12)  $C_{RM} = C_P (Y - X)^+ + C_J (X - Y)^+ + C_R \sum_{i=1}^{n} (x_i - y_i) + C_T \sum_{i=1}^{n} (y_i - x_i)^+ + (C_S + C_R - C_T) \, S$.

Since second echelon shortages are satisfied by either $S^*$, $T^*$ or $V_0$, we obtain

(13)  $\sum_{i=1}^{n} V_i = S^* + T^* + V_0$.

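Substituting (13) into (5) yields

$C_{EM} = C_S^* \, E(S^*) + C_T^* \, E\left( \sum_{i=1}^{n} V_i - S^* - V_0 \right)$,

which, using (9), can be written as

(14)  $C_{EM} = C_T^* \, E\left[ \sum_{i=1}^{n} V_i - V_0 \right] + (C_S^* - C_T^*) \, E[S^*]$.

Since

(15)  $I_0 = Y - \sum_{i=1}^{n} y_i - S^*$,

we can obtain from (8)

(16)  $CC = h E[I + I_0] + (h_0 - h) E[I_0] = h E\left[ Y - \sum_{i=1}^{n} D_i \right]^+ + (h - h_0) \left( E[S^*] + \sum_{i=1}^{n} y_i - Y \right)$.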

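From (12), (14) and (16) we can derive that the objective function $F$ can be written as

(17)
$$\begin{aligned}
F = {} & -h E\left[ \sum_{i=1}^{n} D_i \right] + C_R \sum_{i=1}^{n} x_i + C_J X + (h - h_0 - C_R) \sum_{i=1}^{n} y_i + (h_0 - C_J) Y + C_T \sum_{i=1}^{n} (y_i - x_i)^+ \\
& + (C_P + C_J)(Y - X)^+ + C_T^* E\left[ \sum_{i=1}^{n} (D_i - y_i)^+ \right] + (h + s - C_T^*) E\left[ \sum_{i=1}^{n} D_i - Y \right]^+ \\
& + (h - h_0 + C_S^* - C_T^*) E(S^*) + (C_S + C_R - C_T) S.
\end{aligned}$$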
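Equation (17) can be written more conveniently by introducing the following functions of
$y = (y_0, y_1, y_2, \ldots, y_n)$:

(18a)  $f_1(y) = \sum_{i=1}^{n} y_i$,

(18b)  $f_2(y) = \sum_{i=1}^{n} (y_i - x_i)^+$,

(18c)  $f_3(y) = \left[ \sum_{i=1}^{n} (y_i - x_i) \right]^+$,

(18d)  $f_4(y) = E\left[ \sum_{i=1}^{n} (D_i - y_i)^+ \right]$,

(18e)  $f_5(y) = E\left[ \sum_{i=1}^{n} (D_i - y_i) \right]^+$,

(18f)  $f_6(y) = E\left[ \sum_{i=1}^{n} (D_i - y_i)^+ - y_0 \right]^+$,

(18g)  $f_7(y) = \sum_{i=0}^{n} y_i$,

(18h)  $f_8(y) = \left( \sum_{i=0}^{n} y_i - X \right)^+$,

(18i)  $f_9(y) = E\left[ \sum_{i=1}^{n} D_i - \sum_{i=0}^{n} y_i \right]^+$.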

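With these definitions, (17) becomes

(19)
$$\begin{aligned}
F(y) = {} & -h E\left[ \sum_{i=1}^{n} D_i \right] + C_R \sum_{i=1}^{n} x_i + C_J X + (h - h_0 - C_R) f_1(y) + (h_0 - C_J) f_7(y) + C_T f_2(y) \\
& + (C_P + C_J) f_8(y) + C_T^* f_4(y) + (h + s - C_T^*) f_9(y) + (h - h_0 + C_S^* - C_T^*) E(S^*) + (C_S + C_R - C_T) S.
\end{aligned}$$

Recognizing that $S$ and $E(S^*)$ depend on $y$, we are led to a study of movement strategies.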

2.3. General Results for Movement Strategies 

The movement strategies will obviously depend on the movement costs. We distinguish four 
cases which are analyzed separately. 

CASE 1: $C_T > C_R + C_S$

In this case, it is more expensive to transship items between facilities in the second echelon
than to first return them to the first echelon and then ship them to another facility in the second
echelon. Hence, all the initial stock increases in the second echelon will be effected by a shipment, so

(20)  $S = \sum_{i=1}^{n} (y_i - x_i)^+ = f_2(y)$.

CASE 2: $C_T < C_R + C_S$

In this case shipments will only be used to satisfy those initial stock increases that cannot be
satisfied by transshipments, so

(21)  $S = \left[ \sum_{i=1}^{n} y_i - \sum_{i=1}^{n} x_i \right]^+ = f_3(y)$.
CASE 3: $C_T^* > C_S^*$

In this case expedited shipments will be preferred to expedited transshipments, so as many
shortages as possible will be satisfied with the stock at the first echelon. Thus,

$S^* = \min\left[ \sum_{i=1}^{n} V_i, \; y_0 \right] = \sum_{i=1}^{n} V_i - \left[ \sum_{i=1}^{n} V_i - y_0 \right]^+$,

and so

(22)  $E(S^*) = f_4(y) - f_6(y)$.

CASE 4: $C_T^* < C_S^*$

In this case expedited transshipments will be used to satisfy as much of the shortages as possible,
so

$T^* = \min\left[ \sum_{i=1}^{n} (D_i - y_i)^+, \; \sum_{i=1}^{n} (y_i - D_i)^+ \right]$.

Since

$\left( \sum_{i=1}^{n} D_i - \sum_{i=1}^{n} y_i \right)^+$

is the demand that cannot be satisfied by the initial allocations or by expedited transshipments, we
must have

$\left( \sum_{i=1}^{n} D_i - \sum_{i=1}^{n} y_i \right)^+ = S^* + V_0$;

therefore, from (9)

(23)  $E(S^*) = f_5(y) - f_9(y)$.



Thus, given the relationships among the movement costs, equations (20)-(23) allow us to 
write (19) in terms of the functions defined in (18), yielding an objective function that is expressed 
in terms of the decision variables and the distributions of the random demands. 
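The four cases can be traced on a single realization of the demands. The following sketch is an illustration under stated assumptions, not the authors' procedure: the function name `movements`, its argument names and the sample numbers are hypothetical, and the boundary cases of equal costs (which the text does not discuss) are arbitrarily sent to Cases 2 and 4.

```python
def movements(x, y, y0, d, c_t, c_r, c_s, c_t_exp, c_s_exp):
    """Regular and expedited movements for one demand realization.

    x, y, d: lists for second-echelon facilities 1..n; y0: stock at facility 0
    after adjustment. c_t_exp and c_s_exp stand for C_T^* and C_S^*.
    """
    pos = lambda a: max(a, 0)
    stock_in = sum(pos(yi - xi) for xi, yi in zip(x, y))    # sum of IN_i
    stock_out = sum(pos(xi - yi) for xi, yi in zip(x, y))   # sum of OUT_i
    if c_t > c_r + c_s:
        S = stock_in                       # Case 1, equation (20)
    else:
        S = pos(stock_in - stock_out)      # Case 2, equation (21)
    T = stock_in - S                       # from (10a)
    R = stock_out - T                      # from (10b)

    shortage = sum(pos(di - yi) for di, yi in zip(d, y))    # sum of V_i
    surplus = sum(pos(yi - di) for di, yi in zip(d, y))
    V0 = pos(sum(d) - y0 - sum(y))         # equation (9)
    if c_t_exp > c_s_exp:
        S_exp = min(shortage, y0)          # Case 3
    else:
        S_exp = pos(shortage - surplus) - V0   # Case 4
    T_exp = shortage - S_exp - V0          # from (13)
    return {"S": S, "T": T, "R": R, "S_exp": S_exp, "T_exp": T_exp, "V0": V0}


# Hypothetical data: three second-echelon facilities, 3 units held at facility 0.
x, y, d = [5, 2, 8], [4, 6, 8], [7, -1, 8]
print(movements(x, y, 3, d, c_t=5, c_r=1, c_s=2, c_t_exp=9, c_s_exp=4))
print(movements(x, y, 3, d, c_t=2, c_r=1, c_s=2, c_t_exp=4, c_s_exp=9))
```

In the first call transshipping is the expensive mode in both stages, so all four added units are shipped from facility 0 and the three units short are expedited from facility 0. In the second call one unit is transshipped initially and the whole shortage is covered by expedited transshipments.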

2.4. The Four General Forms of the Objective Function 

The four comparisons between movement costs given in the last section give rise to four 
different cases of the general problem. They are 

PROBLEM I: $C_T^* < C_S^*$, $C_T > C_R + C_S$,

PROBLEM II: $C_T^* < C_S^*$, $C_T < C_R + C_S$,

PROBLEM III: $C_T^* > C_S^*$, $C_T > C_R + C_S$,

PROBLEM IV: $C_T^* > C_S^*$, $C_T < C_R + C_S$.

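Using (20)-(23) the objective function (dropping the constant) for each problem is given in
Table 1 below.

Table 1. Coefficients of the Objective Function

Function  Problem I                    Problem II                   Problem III                  Problem IV
          $C_T^* < C_S^*$              $C_T^* < C_S^*$              $C_T^* > C_S^*$              $C_T^* > C_S^*$
          $C_T > C_R + C_S$            $C_T < C_R + C_S$            $C_T > C_R + C_S$            $C_T < C_R + C_S$

$f_1(y)$  $h - h_0 - C_R$              $h - h_0 - C_R$              $h - h_0 - C_R$              $h - h_0 - C_R$
$f_2(y)$  $C_S + C_R$                  $C_T$                        $C_S + C_R$                  $C_T$
$f_3(y)$  0                            $C_S + C_R - C_T$            0                            $C_S + C_R - C_T$
$f_4(y)$  $C_T^*$                      $C_T^*$                      $h - h_0 + C_S^*$            $h - h_0 + C_S^*$
$f_5(y)$  $h - h_0 + C_S^* - C_T^*$    $h - h_0 + C_S^* - C_T^*$    0                            0
$f_6(y)$  0                            0                            $h_0 - h + C_T^* - C_S^*$    $h_0 - h + C_T^* - C_S^*$
$f_7(y)$  $h_0 - C_J$                  $h_0 - C_J$                  $h_0 - C_J$                  $h_0 - C_J$
$f_8(y)$  $C_P + C_J$                  $C_P + C_J$                  $C_P + C_J$                  $C_P + C_J$
$f_9(y)$  $h_0 + s - C_S^*$            $h_0 + s - C_S^*$            $h + s - C_T^*$              $h + s - C_T^*$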
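The table entries can be checked mechanically against (19). The sketch below is an illustration only: it replaces the expectations in (18d)-(18i) by averages over a hypothetical sample of demand vectors, evaluates the non-constant part of (19) with $S$ and $E(S^*)$ taken from (20)-(23), and compares the result with the weighted sum of the $f_i$ given by Table 1. All names and numbers are assumptions made for the example.

```python
import random


def pos(a):
    return max(a, 0.0)


def f_values(y0, y, x, X, samples):
    """Sample-average versions of f_1..f_9 in (18a)-(18i)."""
    m = len(samples)
    Y = y0 + sum(y)
    local = [sum(pos(di - yi) for di, yi in zip(d, y)) for d in samples]
    return {
        1: sum(y),
        2: sum(pos(yi - xi) for xi, yi in zip(x, y)),
        3: pos(sum(y) - sum(x)),
        4: sum(local) / m,
        5: sum(pos(sum(d) - sum(y)) for d in samples) / m,
        6: sum(pos(v - y0) for v in local) / m,
        7: Y,
        8: pos(Y - X),
        9: sum(pos(sum(d) - Y) for d in samples) / m,
    }


def table1(problem, c):
    """Coefficients of f_1..f_9 for Problems I-IV, as listed in Table 1."""
    h, h0, s = c["h"], c["h0"], c["s"]
    CP, CJ, CS, CT, CR = c["CP"], c["CJ"], c["CS"], c["CT"], c["CR"]
    CSx, CTx = c["CS*"], c["CT*"]
    trans_cheap = problem in ("I", "II")     # C_T^* < C_S^*
    ship_only = problem in ("I", "III")      # C_T > C_R + C_S
    return {
        1: h - h0 - CR,
        2: CS + CR if ship_only else CT,
        3: 0.0 if ship_only else CS + CR - CT,
        4: CTx if trans_cheap else h - h0 + CSx,
        5: h - h0 + CSx - CTx if trans_cheap else 0.0,
        6: 0.0 if trans_cheap else h0 - h + CTx - CSx,
        7: h0 - CJ,
        8: CP + CJ,
        9: h0 + s - CSx if trans_cheap else h + s - CTx,
    }


def objective_from_19(problem, c, f):
    """Non-constant part of (19), with S and E(S*) taken from (20)-(23)."""
    S = f[2] if problem in ("I", "III") else f[3]
    ES = f[5] - f[9] if problem in ("I", "II") else f[4] - f[6]
    return ((c["h"] - c["h0"] - c["CR"]) * f[1] + (c["h0"] - c["CJ"]) * f[7]
            + c["CT"] * f[2] + (c["CP"] + c["CJ"]) * f[8] + c["CT*"] * f[4]
            + (c["h"] + c["s"] - c["CT*"]) * f[9]
            + (c["h"] - c["h0"] + c["CS*"] - c["CT*"]) * ES
            + (c["CS"] + c["CR"] - c["CT"]) * S)


# Hypothetical costs and data; the identity is algebraic, so any values work.
random.seed(0)
costs = {"h": 1.0, "h0": 0.8, "s": 40.0, "CP": 10.0, "CJ": -2.0,
         "CS": 1.0, "CT": 1.5, "CR": 1.0, "CS*": 5.0, "CT*": 3.0}
x = [5.0, 2.0, 8.0]
X = 4.0 + sum(x)                     # x_0 = 4
y0, y = 3.0, [4.0, 6.0, 8.0]
samples = [[random.gauss(5.0, 3.0) for _ in x] for _ in range(200)]
f = f_values(y0, y, x, X, samples)
for p in ("I", "II", "III", "IV"):
    via_table = sum(coef * f[k] for k, coef in table1(p, costs).items())
    print(p, round(via_table, 6), round(objective_from_19(p, costs, f), 6))
```

The two printed columns agree for every problem, which is only a consistency check of the algebra; it says nothing about which problem applies to a given set of costs.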



2.5. Convexity of the Objective Function 

We will now give some simple sufficient conditions for the objective functions shown in Table 1
to be convex. First, we show that $f_i(y)$, $i = 1, 2, \ldots, 9$ are convex functions.

THEOREM 1: The functions $f_i(y)$, $i = 1, 2, \ldots, 9$ are convex functions of $y$.

PROOF: a) Since $f_1(y)$ is a linear function, both $-f_1(y)$ and $f_1(y)$ are convex. b) The function
$(y_i - x_i)^+$ is obviously a convex function of $y_i$, so $f_2(y)$ is convex because it is a sum of convex
functions. c) $f_3(y)$ is the maximum of

$\sum_{i=1}^{n} (y_i - x_i)$

and the null function, both of which are convex, so $f_3(y)$ is convex. d) Since

$\sum_{i=1}^{n} (D_i - y_i)^+$

is clearly convex in $y$ for any choice of $D_i$, $i = 1, 2, \ldots, n$, $f_4(y)$ is convex because convexity is
preserved under expectations. e) Since

$\left[ \sum_{i=1}^{n} (D_i - y_i) \right]^+$

is convex for any choice of $D_i$, $i = 1, \ldots, n$, $f_5(y)$ is convex. Since

$\left[ \sum_{i=1}^{n} (D_i - y_i)^+ - y_0 \right]^+$

is the maximum of a convex function and the null function for any choice of $D_i$, $i = 1, \ldots, n$, $f_6(y)$
is convex. The remainder of the proof follows in a similar manner. Q.E.D.

So sufficient conditions for the objective functions to be convex are that the coefficients of all
the $f_i$'s except $f_1$ and $f_7$ are nonnegative. A special case of interest for which the above is true is

(24a)  $h = h_0$

(24b)  $C_P \ge -C_J$

(24c)  $s \ge C_S^*$

(24d)  $s \ge C_T^*$;

i.e., the carrying cost factor is the same for both echelons, the purchase price is greater than the
disposition value and the system shortage cost is greater than both the expedited shipping and
transshipment costs.

If the inequalities in (24) hold, then the problem of finding a nonnegative value of $y$ to minimize
(19) is a convex programming problem. This type of mathematical programming problem has
been extensively studied (see, e.g., Eaves and Saigal [6], Fiacco and McCormick [7], Mifflin [11],
and Zangwill [13]) and algorithms and computer codes are available to solve it. We should point
out that the functions $f_i(y)$ are not necessarily differentiable. For example, $f_2(y)$ is not differentiable
at those points where $y_i = x_i$, $i = 1, \ldots, n$ and $f_4(y)$ is not differentiable at those points where
$y_i = d_i$ if $\Pr\{D_i = d_i\} > 0$.

However, the manner in which the derivative fails to exist is very simple, so algorithms that 
require a differentiable objective function should be adequate for this problem. For the inventory 
applications considered here, y must be composed of integers to be physically meaningful. Since 
we do not know of an algorithm that can solve such a problem, we have chosen to ignore the integer 
aspects of y. Thus, the usual caveats about solving integer problems as noninteger ones apply. 




3. APPLICATIONS 

In this section we discuss how the general model can be used in several applications and find
the optimal solution for some of them. To keep the characterizations relatively simple, we assume
that the c.d.f. of $D_i$ is of the form

(25)
$$\Pr\{D_i \le y\} = \begin{cases}
\int_{-\infty}^{y} g_i(x)\,dx, & y < 0, \\
\int_{-\infty}^{0} g_i(x)\,dx + p_i, & y = 0, \\
\int_{-\infty}^{0} g_i(x)\,dx + p_i + \int_{0}^{y} f_i(x)\,dx, & y > 0;
\end{cases}$$

i.e., $\Pr\{D_i = 0\} = p_i$,
but otherwise $D_i$ is a continuous random variable with a density function. This assumption allows
us to model both high and low volume products.

In deriving the characterizations, the following Lemmas are useful. 

LEMMA 1: For a random variable $D$,

$\frac{d^+}{dy} E[D - y]^+ = -\Pr\{D > y\}$

$\frac{d^-}{dy} E[D - y]^+ = -\Pr\{D \ge y\}$,

where $d^+$ and $d^-$ denote right and left derivatives respectively.

PROOF: The Lemma follows directly by interchanging the derivative and expectation
operators. Q.E.D.
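A quick numerical check of Lemma 1 is possible when $D$ has a finite number of atoms, because $E[D - y]^+$ is then piecewise linear and one-sided difference quotients are exact up to rounding. The distribution below is hypothetical and the sketch is an illustration only.

```python
def expected_shortage(y, values, probs):
    """E[(D - y)^+] for a discrete random variable D."""
    return sum(p * max(v - y, 0.0) for v, p in zip(values, probs))


# Hypothetical demand distribution with an atom at the test point y = 2.
values = [-1.0, 0.0, 2.0, 5.0]
probs = [0.1, 0.4, 0.3, 0.2]
y, eps = 2.0, 1e-6

base = expected_shortage(y, values, probs)
right = (expected_shortage(y + eps, values, probs) - base) / eps
left = (base - expected_shortage(y - eps, values, probs)) / eps

prob_greater = sum(p for v, p in zip(values, probs) if v > y)     # Pr{D > y}
prob_at_least = sum(p for v, p in zip(values, probs) if v >= y)   # Pr{D >= y}
print(right, -prob_greater)    # both close to -0.2
print(left, -prob_at_least)    # both close to -0.5
```

The two one-sided derivatives differ by the probability mass at $y$, which is why the later optimality conditions distinguish right and left derivatives at the bounds.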

The second Lemma is due to Allen [1, Theorem 1, p. 339] and we will use it extensively. 

LEMMA 2: (Allen) Let $F(y)$ be a real, convex differentiable function in n-dimensional
Euclidean space. Let $l_i$ ($u_i$) be a minimum (maximum) of the i-th coordinate $y_i$ if $y_i$ is bounded from below
(above). Let $R$ be the set of all points so restricted. Then a necessary and sufficient condition that a
point $y^\circ$ minimize $F$ over $R$ is that either

$\frac{\partial F(y^\circ)}{\partial y_i} = 0$

or

$y_i^\circ = l_i$ and $\frac{\partial F(y^\circ)}{\partial y_i} \ge 0$

or

$y_i^\circ = u_i$ and $\frac{\partial F(y^\circ)}{\partial y_i} \le 0$

is satisfied for $i = 1, 2, \ldots, n$.

Allen also describes an algorithm for finding $y^\circ$, so a method for obtaining numerical solutions
already exists.

3.1. The Stock Control Problem 

Consider a regional distribution system consisting of a material management center (MMC) 
(the first echelon) and several service centers (the second echelon). The stocks owned by the region 
are purchased through the MMC from the factory and are allocated to all facilities in the region. 
In the case of a product where growth is taking place at all service centers, it is appropriate to
assume that $\Pr\{D_i \ge 0\} = 1$ and that there is no need to return material to the MMC, transship
material between service centers or to dispose of material on a regular basis; hence, we also assume
that $y_i \ge x_i$, $i = 1, \ldots, n$. Also, since regular shipments are cheaper than expedited shipments, and
the average distance between service centers is less than the average distance between the MMC
and the service centers, we assume that $C_S < C_T^* < C_S^* < s$. This is a special case of either Problem I
or II which are the same in this case because $f_2(y) = f_3(y)$. To avoid unnecessary complication, it is
also assumed that $h = h_0$.

Note that the assumption $C_T^* < C_S^*$ amounts to assuming that when a demand cannot be
satisfied by the local service center, an expedited transshipment is made from another service center if
possible.

The multiperiod extension of this problem is solved in Section 4. 

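From the general theory, the problem is to find a vector

$y^\circ = (y_0^\circ, y_1^\circ, \ldots, y_n^\circ)$

which minimizes

(26)
$$\begin{aligned}
F_1(y) = {} & C_P \left( \sum_{i=0}^{n} y_i - X \right) + h \left[ \sum_{i=0}^{n} y_i - \sum_{i=1}^{n} E(D_i) \right] + (h + s - C_S^*) E\left[ \sum_{i=1}^{n} D_i - \sum_{i=0}^{n} y_i \right]^+ \\
& + C_S \sum_{i=1}^{n} (y_i - x_i) + C_T^* E\left[ \sum_{i=1}^{n} (D_i - y_i)^+ \right] + (C_S^* - C_T^*) E\left[ \sum_{i=1}^{n} (D_i - y_i) \right]^+,
\end{aligned}$$

subject to

(27)  $y_0 \ge 0$, \quad $y_i \ge x_i$, $i = 1, \ldots, n$, \quad $\sum_{i=0}^{n} y_i \ge X$.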

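According to Lemma 2, to find $y^\circ$, we need only construct a vector for which either

(28a)  $\frac{\partial F_1(y^\circ)}{\partial y_0} = 0$ or $y_0^\circ = 0$ and $\frac{\partial F_1(y^\circ)}{\partial y_0} \ge 0$

and for which either

(28b)  $\frac{\partial F_1(y^\circ)}{\partial y_i} = 0$ or $y_i^\circ = x_i$ and $\frac{\partial F_1(y^\circ)}{\partial y_i} \ge 0$

for $i = 1, \ldots, n$, and for which

(28c)  $\sum_{i=0}^{n} y_i^\circ \ge X$.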

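To construct such a vector, first note that

$\frac{\partial F_1(y)}{\partial y_0} = (\ge)\ 0$ iff

(29a)  $\Pr\left\{ \sum_{i=1}^{n} D_i > \sum_{i=0}^{n} y_i \right\} = (\le)\ \frac{C_P + h}{h + s - C_S^*}$

and

$\frac{\partial F_1(y)}{\partial y_i} = (\ge)\ 0$, $i = 1, \ldots, n$ iff

(29b)  $\Pr\{D_i > y_i\} = (\le)\ G(y)$,

where

$G(y) = \left[ C_P + h - (h + s - C_S^*) \Pr\left\{ \sum_{i=1}^{n} D_i > \sum_{i=0}^{n} y_i \right\} + C_S - (C_S^* - C_T^*) \Pr\left\{ \sum_{i=1}^{n} D_i > \sum_{i=1}^{n} y_i \right\} \right] / C_T^*$.

To keep the analysis relatively simple, we assume that $s \ge C_P + C_S^*$ so that

$(C_P + h)/(h + s - C_S^*) \le 1$,

and assume that $Y$ defined by

(30a)  $\Pr\left\{ \sum_{i=1}^{n} D_i > Y \right\} = \frac{C_P + h}{h + s - C_S^*}$

is greater than $X$.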

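Now for $0 \le \epsilon \le 1$ and $i = 1, \ldots, n$, define

(30b)  $y_i(\epsilon) = \begin{cases} x_i & \text{if } \Pr\{D_i > x_i\} \le \epsilon, \\ \text{a solution to } \Pr\{D_i > y_i\} = \epsilon & \text{otherwise,} \end{cases}$

and in the latter case $y_i(\epsilon) \ge x_i$, and define

(30c)  $y_0(\epsilon) = \left[ Y - \sum_{i=1}^{n} y_i(\epsilon) \right]^+$,

$R(\epsilon) = G(y(\epsilon))$.

Note that either

$Y = \sum_{i=0}^{n} y_i(\epsilon)$

or $y_0(\epsilon) = 0$ and

$Y < \sum_{i=0}^{n} y_i(\epsilon)$;

hence, $y(\epsilon)$ satisfies (28a) and (28c) for any $\epsilon$.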



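Now,

$R(0) = \frac{C_P + h + C_S}{C_T^*} > 0$,

$R(1) = \frac{C_S - (C_S^* - C_T^*) \Pr\left\{ \sum_{i=1}^{n} D_i > \sum_{i=1}^{n} x_i \right\}}{C_T^*} < 1$,

and $R(\epsilon)$ is a nonincreasing function of $\epsilon$; therefore, $R(\epsilon) = \epsilon$ has a solution, $\epsilon^*$, and $y^\circ = y(\epsilon^*)$
satisfies (28).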




To interpret the results, about the only thing that can be said in general is that all second
echelon facilities that receive stock will experience the same shortage probability, and if some stock
is initially allocated to the first echelon, then the system shortage probability is

$(C_P + h)/(h + s - C_S^*)$.

The first conclusion is the same as a result of Simpson [12, Theorem 2, p. 801] and extends that result 
to the more general situation where transshipments and expedited movements are possible. 
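The construction in (30a)-(30c) can be carried out numerically once a demand distribution is fixed. The sketch below is an illustration under strong assumptions that are not in the paper: the demands are taken to be independent exponential variables with a common mean (so that the total demand is Erlang distributed and $\Pr\{D_i > y\} = e^{-y/\text{mean}}$), the starting stocks are nonnegative, and the roots are found by plain bisection. The function names and cost figures are hypothetical.

```python
import math


def erlang_sf(t, n, mean):
    """Pr{D_1 + ... + D_n > t} for i.i.d. exponential demands with the given mean."""
    if t <= 0:
        return 1.0
    z = t / mean
    return math.exp(-z) * sum(z ** k / math.factorial(k) for k in range(n))


def bisect(func, lo, hi, iters=200):
    """Root of a decreasing function with func(lo) > 0 > func(hi)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if func(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def stock_control(x, mean, CP, h, s, CS, CT_exp, CS_exp):
    """Return Y, eps*, (y_0, y) and R(eps*) following (30a)-(30c)."""
    n = len(x)
    ratio = (CP + h) / (h + s - CS_exp)
    Y = bisect(lambda t: erlang_sf(t, n, mean) - ratio, 0.0, 100.0 * n * mean)  # (30a)

    def y_of(eps):
        # (30b): keep x_i if Pr{D_i > x_i} <= eps, else solve exp(-y/mean) = eps.
        y = [max(xi, -mean * math.log(eps)) for xi in x]
        y0 = max(Y - sum(y), 0.0)                                               # (30c)
        return y0, y

    def R(eps):
        y0, y = y_of(eps)
        numerator = (CP + h - (h + s - CS_exp) * erlang_sf(y0 + sum(y), n, mean)
                     + CS - (CS_exp - CT_exp) * erlang_sf(sum(y), n, mean))
        return numerator / CT_exp

    eps_star = bisect(lambda e: R(e) - e, 1e-12, 1.0)
    return Y, eps_star, y_of(eps_star), R(eps_star)


# Hypothetical data satisfying C_S < C_T^* < C_S^* < s and s >= C_P + C_S^*.
x = [2.0, 5.0, 12.0]
Y, eps_star, (y0, y), r = stock_control(x, mean=10.0, CP=10.0, h=1.0, s=40.0,
                                        CS=1.0, CT_exp=3.0, CS_exp=5.0)
print(round(Y, 3), round(eps_star, 4), round(y0, 3), [round(v, 3) for v in y])
```

With these particular numbers every service center receives stock, so all three end up with the common shortage probability $\epsilon^*$, in line with the interpretation given above.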

3.2. The Returns Problem 

A problem that arises when net demand at some second echelon facilities may be negative is: 
how many items should these facilities return to the first echelon? We shall call this the returns 
problem. 

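The returns problem is the special case of Problem III where $C_T > C_R + C_S$ (hence $T = 0$) and
$C_S = C_S^* < C_T^*$ (hence regular shipments will be zero). To avoid unnecessary complications, we
assume that $h = h_0$ and that

$\sum_{i=0}^{n} y_i$

is given and equal to

$\sum_{i=0}^{n} x_i$.

Thus we seek a vector $y = (y_1, \ldots, y_n)$ to minimize

(31)  $F_2(y) = C_R \sum_{i=1}^{n} (x_i - y_i) + C_S^* E\left[ \sum_{i=1}^{n} (D_i - y_i)^+ \right] + (C_T^* - C_S^*) E\left[ \sum_{i=1}^{n} (D_i - y_i)^+ - x_0 - \sum_{i=1}^{n} (x_i - y_i) \right]^+$

subject to

(32)  $0 \le y_i \le x_i$, $i = 1, 2, \ldots, n$.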

An algorithm for solving exactly this type of problem is given in [11]. We shall restrict ourselves 
to characterizing the optimal solution. 

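Since $F_2$ is a convex function, Lemma 2 asserts that $y^\circ$ is an optimal point if and only if either

(33a)  $\frac{\partial F_2(y^\circ)}{\partial y_i} = 0$, $0 \le y_i^\circ \le x_i$

or

(33b)  $\frac{\partial F_2(y^\circ)}{\partial y_i} \ge 0$ and $y_i^\circ = 0$,

or

(33c)  $\frac{\partial F_2(y^\circ)}{\partial y_i} \le 0$ and $y_i^\circ = x_i$,

holds for $i = 1, 2, \ldots, n$.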



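By interchanging the derivative and expectation operators we get

(34a)  $\frac{\partial^+}{\partial y_i} E\left[ \sum_{j=1}^{n} (D_j - y_j)^+ - x_0 - \sum_{j=1}^{n} (x_j - y_j) \right]^+ = \Pr\left\{ D_i \le y_i, \; \sum_{j=1}^{n} (D_j - y_j)^+ > y_0 \right\}$,

(34b)  $\frac{\partial^-}{\partial y_i} E\left[ \sum_{j=1}^{n} (D_j - y_j)^+ - x_0 - \sum_{j=1}^{n} (x_j - y_j) \right]^+ = \Pr\left\{ D_i < y_i, \; \sum_{j=1}^{n} (D_j - y_j)^+ > y_0 \right\}$,

where $y_0 = x_0 + \sum_{j=1}^{n} (x_j - y_j)$ and

$\frac{\partial^+}{\partial y_i} = \frac{\partial^-}{\partial y_i}$

whenever $0 < y_i < x_i$.

Recalling that $V_i = (D_i - y_i)^+$ and

$R = \sum_{i=1}^{n} (x_i - y_i)$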

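and combining (33) and (34) yields: $y^\circ$ is an optimal solution if either

(35a)  $C_T^* \Pr\left\{ V_i = 0, \sum_{j=1}^{n} V_j > x_0 + R \right\} = C_R + C_S^* \left[ \Pr\{V_i > 0\} + \Pr\left\{ V_i = 0, \sum_{j=1}^{n} V_j > x_0 + R \right\} \right]$, $0 \le y_i^\circ \le x_i$

or

(35b)  $C_T^* \Pr\left\{ D_i \le 0, \sum_{j=1}^{n} V_j > x_0 + R \right\} \ge C_R + C_S^* \left[ \Pr\{D_i > 0\} + \Pr\left\{ D_i \le 0, \sum_{j=1}^{n} V_j > x_0 + R \right\} \right]$, $y_i^\circ = 0$

or

(35c)  $C_T^* \Pr\left\{ D_i < x_i, \sum_{j=1}^{n} V_j > x_0 + R \right\} \le C_R + C_S^* \left[ \Pr\{D_i \ge x_i\} + \Pr\left\{ D_i < x_i, \sum_{j=1}^{n} V_j > x_0 + R \right\} \right]$, $y_i^\circ = x_i$.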



To help interpret (35) note that the right side of (35a) is the expected additional return plus
expedited shipment cost associated with an additional unit returned from facility i to facility 0,
and the left side is the expected expedited transshipment cost of that same unit if it were not
returned.



3.3. The Transshipment Problem 

Consider a multi-service center company where orders placed on a service center can be satisfied 
by an expedited transshipment from another service center when the original service center is out 
of stock. For this system we are interested in determining if transshipments done in anticipation of 
demand are economical. For this problem there is no first echelon and the only types of regular and 
expedited movements are transshipments. From the general theory, it can be shown that the 
optimization problem is to find a vector $y = (y_1, \ldots, y_n)$ which minimizes

(36)  $F_3(y) = C_T \sum_{i=1}^{n} (y_i - x_i)^+ + C_T^* E\left[ \sum_{i=1}^{n} (D_i - y_i)^+ \right]$

subject to

(37)  $\sum_{i=1}^{n} y_i = \sum_{i=1}^{n} x_i$.

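The solution, $y^\circ$, to this problem can be obtained by minimizing the Lagrangian function

(38)  $L(y, \lambda) = F_3(y) - \lambda \left[ \sum_{i=1}^{n} x_i - \sum_{i=1}^{n} y_i \right]$

subject to $y \ge 0$, $\lambda \ge 0$,

$\sum_{i=1}^{n} y_i \le \sum_{i=1}^{n} x_i$.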

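According to Lemma 2 we need only construct $y^\circ$, $\lambda^\circ$ for which either

(39a)  $\frac{\partial L(y^\circ, \lambda^\circ)}{\partial y_i} = 0$, \quad $0 < y_i^\circ < x_i$ or $x_i < y_i^\circ < \sum_{j=1}^{n} x_j$

or

(39b)  $\frac{\partial^+ L(y^\circ, \lambda^\circ)}{\partial y_i} \ge 0$, \quad $y_i^\circ = 0$

or

(39c)  $\frac{\partial^- L(y^\circ, \lambda^\circ)}{\partial y_i} \le 0 \le \frac{\partial^+ L(y^\circ, \lambda^\circ)}{\partial y_i}$, \quad $y_i^\circ = x_i$

or

(39d)  $\frac{\partial^- L(y^\circ, \lambda^\circ)}{\partial y_i} \le 0$, \quad $y_i^\circ = \sum_{j=1}^{n} x_j$.

These conditions imply

(40a)  $\Pr\{D_i > y_i^\circ\} = \frac{\lambda^\circ}{C_T^*}$, \quad $0 < y_i^\circ < x_i$

or

(40b)  $\Pr\{D_i > y_i^\circ\} = \frac{\lambda^\circ + C_T}{C_T^*}$, \quad $x_i < y_i^\circ < \sum_{j=1}^{n} x_j$

or

(40c)  $\Pr\{D_i > 0\} \le \frac{\lambda^\circ}{C_T^*}$, \quad $y_i^\circ = 0$

or

(40d)  $\frac{\lambda^\circ}{C_T^*} \le \Pr\{D_i > x_i\} \le \frac{\lambda^\circ + C_T}{C_T^*}$, \quad $y_i^\circ = x_i$

or

(40e)  $\frac{\lambda^\circ + C_T}{C_T^*} \le \Pr\{D_i > y_i^\circ\}$, \quad $y_i^\circ = \sum_{j=1}^{n} x_j$.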

The Lagrange multiplier $\lambda^\circ$ can be interpreted as the "cost" of using up one of the items
available for transshipment. Rewrite (40b) as

$C_T^* \Pr\{D_i > y_i^\circ\} = \lambda^\circ + C_T$

and observe that for the last item received at i, the left side is the expected savings in expedited
transshipments and the right side is the cost of placing it at location i. For those locations
which transship some of their stock, the same interpretation holds with $C_T = 0$ (because the cost
of transshipping to themselves is zero) which yields (40a). So all facilities which receive
transshipments have the same shortage probability, which is larger than the common shortage
probability for all locations which transship some but not all of their stock.
Another conclusion that follows from these conditions is that if

$\Pr\{D_i > x_i\} \le \frac{C_T}{C_T^*}$

for all i, then $\lambda^\circ = 0$, $y^\circ = (x_1, \ldots, x_n)$; i.e., nothing is transshipped.
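Conditions (40a)-(40d) suggest a simple numerical scheme: for a trial multiplier, set each $y_i$ from its own condition and then adjust the multiplier until constraint (37) holds. The sketch below is an illustration of that idea and is not Allen's algorithm; it assumes independent exponential demands with hypothetical means (so $\Pr\{D_i > y\} = e^{-y/m_i}$), positive starting stocks, and bisection on the multiplier. All names are hypothetical.

```python
import math


def allocation(lam, x, means, CT, CT_exp):
    """Stock levels implied by (40a)-(40d) for a trial multiplier lam."""
    q_low, q_high = lam / CT_exp, (lam + CT) / CT_exp
    y = []
    for xi, m in zip(x, means):
        p = math.exp(-xi / m)                      # Pr{D_i > x_i}
        if p > q_high:                             # receives stock, (40b)
            y.append(-m * math.log(q_high))
        elif p < q_low:                            # gives up stock, (40a) or (40c)
            y.append(-m * math.log(q_low) if q_low < 1.0 else 0.0)
        else:                                      # untouched, (40d)
            y.append(xi)
    return y


def solve_transshipment(x, means, CT, CT_exp, iters=200):
    """Find lam >= 0 such that the allocation satisfies constraint (37)."""
    total = sum(x)
    excess = lambda lam: sum(allocation(lam, x, means, CT, CT_exp)) - total
    if excess(0.0) <= 1e-12:
        return 0.0, list(x)                        # nothing is transshipped
    lo, hi = 0.0, CT_exp                           # excess(lo) > 0 > excess(hi)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return lam, allocation(lam, x, means, CT, CT_exp)


# Hypothetical data: three service centers, C_T = 1, C_T^* = 4.
x = [1.0, 20.0, 15.0]
means = [10.0, 10.0, 2.0]
lam, y = solve_transshipment(x, means, CT=1.0, CT_exp=4.0)
print(round(lam, 4), [round(v, 3) for v in y])
```

In this example the third center is overstocked relative to its demand and sends stock to the first, while the second keeps what it has; the receiving center ends with the higher shortage probability, as the discussion above predicts.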

4. MULTIPERIOD EXTENSION 

The multiperiod extensions of our model using either the discounted total expected cost criterion 
or the expected cost per period criterion are straightforward to formulate and in general difficult 
to solve. However, there is one situation where a formulation of the multiperiod problem with 
the expected cost per period criterion is tractable. This is when the solution to the one period 
problem does not depend on $x = (x_0, x_1, \ldots, x_n)$; because then it is reasonable in the multiperiod
problem (it may even be optimal) to restrict attention to policies that do not depend on $x$.

The policy is specified by a vector $y \ge 0$ and the cost rate for an infinite horizon can be expressed in terms
of $y$, the vector of inventory levels just before the demands occur, and then minimized to find the
optimal value. To illustrate this we consider the infinite horizon extension of the stock control
problem.

Using the policy y let C{y) be the asymptotic cost rate, L(y) be the expected holding plus 
shortage cost rate for the system, P(y) be the expected cost rate of purchases, M{y) be the expected 
cost rate for regular movements and M*(y) be the expected cost rate for expedited movements. 
We assume that all demand processes are stationary and all cost factors are constant with time. 
The assumptions made in section 3.1 are invoked here also. 

Clearly 

(41) $C(y) = L(y) + P(y) + M(y) + M^*(y).$

From the nature of the policy we find 

(42a) $L(y) = s\,E\Big(\sum_{i=1}^{n} D_i - \sum_{i=0}^{n} y_i\Big)^{+} + h\,E\Big(\sum_{i=0}^{n} y_i - \sum_{i=1}^{n} D_i\Big)^{+}$

and

(42b) $P(y) = C_p \sum_{i=1}^{n} E(D_i)$

immediately. Since the only regular movements are shipments and in each period the number 
shipped to the second echelon will be just enough to replace the stock depletion from demand in 
the last period, 

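(42c) $M(y) = C_s E\Big[\min\Big(\sum_{i=1}^{n} D_i,\ \sum_{i=1}^{n} y_i\Big)\Big] = C_s E\Big[\sum_{i=1}^{n} D_i - \Big(\sum_{i=1}^{n} (D_i - y_i)\Big)^{+}\Big].$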

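Since $C_r^* < C_s^*$ we obtain from (9), (14) and (23)

(42d) $M^*(y) = C_r^* E\Big(\sum_{i=1}^{n} V_i - V_0\Big) + (C_s^* - C_r^*) E(S^*)$

$= C_r^* E\Big[\sum_{i=1}^{n} (D_i - y_i)^{+} - \Big(\sum_{i=1}^{n} D_i - \sum_{i=0}^{n} y_i\Big)^{+}\Big]$

$+ (C_s^* - C_r^*) \Big[E\Big(\sum_{i=1}^{n} (D_i - y_i)\Big)^{+} - E\Big(\sum_{i=1}^{n} D_i - \sum_{i=0}^{n} y_i\Big)^{+}\Big].$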

Substituting (42a) -(42d) into (41) and collecting terms yields 

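(43) $C(y) = (C_p + C_s - h) \sum_{i=1}^{n} E(D_i) + h \sum_{i=0}^{n} y_i + (h + s - C_s^*) E\Big[\sum_{i=1}^{n} D_i - \sum_{i=0}^{n} y_i\Big]^{+} + C_r^* E\Big[\sum_{i=1}^{n} (D_i - y_i)^{+}\Big]$

$+ (C_s^* - C_r^* - C_s) E\Big[\sum_{i=1}^{n} (D_i - y_i)\Big]^{+}.$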
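The collection of terms leading to (43) can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes $y_0$ to be the first-echelon stock and $D_1, \dots, D_n$ the second-echelon demands, reads (42a) through (42d) as sample-path expressions, and draws exponential demands purely for the sake of having numbers. The function and parameter names (`cost_rate_two_ways`, `c_r_exp`, `c_s_exp`) are hypothetical stand-ins, not notation from the paper.

```python
import random


def positive_part(x):
    """Return x^+ = max(x, 0)."""
    return x if x > 0 else 0.0


def cost_rate_two_ways(y, samples, s, h, c_p, c_s, c_r_exp, c_s_exp):
    """Average the cost rate over demand samples, term by term and in collected form.

    y        -- stock levels (y_0, y_1, ..., y_n); y_0 is assumed to be the first echelon
    samples  -- list of demand vectors (D_1, ..., D_n)
    c_r_exp, c_s_exp -- hypothetical names standing in for C*_r and C*_s
    """
    total_y = sum(y)
    direct = collected = 0.0
    for d in samples:
        total_d = sum(d)
        system_short = positive_part(total_d - total_y)
        echelon_short = positive_part(sum(di - yi for di, yi in zip(d, y[1:])))
        local_short = sum(positive_part(di - yi) for di, yi in zip(d, y[1:]))

        holding_shortage = s * system_short + h * positive_part(total_y - total_d)  # (42a)
        purchases = c_p * total_d                                                   # (42b)
        regular = c_s * (total_d - echelon_short)                                   # (42c)
        expedited = (c_r_exp * (local_short - system_short)
                     + (c_s_exp - c_r_exp) * (echelon_short - system_short))        # (42d)
        direct += holding_shortage + purchases + regular + expedited                # (41)

        collected += ((c_p + c_s - h) * total_d + h * total_y
                      + (h + s - c_s_exp) * system_short
                      + c_r_exp * local_short
                      + (c_s_exp - c_r_exp - c_s) * echelon_short)                  # (43)
    return direct / len(samples), collected / len(samples)


random.seed(1)
demands = [[random.expovariate(1.0) for _ in range(3)] for _ in range(2000)]
direct, collected = cost_rate_two_ways([1.0, 1.5, 0.5, 1.0], demands,
                                       s=10.0, h=1.0, c_p=2.0, c_s=0.5,
                                       c_r_exp=3.0, c_s_exp=4.0)
print(direct, collected)
```

Because the identity holds sample by sample, the two averages should agree up to rounding for any demand distribution; the exponential choice carries no modelling significance.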

Again according to Lemma 2, to find $y^0$, we need only construct a vector for which either

(44) $\frac{\partial C(y^0)}{\partial y_i} = 0$, or $y_i^0 = 0$ and $\frac{\partial C(y^0)}{\partial y_i} \ge 0$

for i=0, 1, . . ., n. The vector $y^0$ may be constructed by using the methods in Section 3.1; we omit
the details.

The solution is different from the single period solution, but has the same characteristics. One 
of these characteristics is that the probability of shortage at a second echelon facility can be quite 
large. This is because once an item has been shipped to the second echelon, the only way to move it 
to a different second echelon facility is by an expensive expedited transshipment. When the item 
is retained at the first echelon and is not needed during the period in question, it may be subse- 
quently shipped by the cheaper mode to a more needy second echelon facility. 

Unfortunately, the above approach does not work in general. For example, the one period 
solutions of the returns problem and the transshipment problem discussed in Sections 3.2 and 3.3
both depend on x and therefore the analyses cannot readily be extended to the multiperiod case. 




5. ACKNOWLEDGMENT 

The authors would like to acknowledge the contributions made to this work by Alan Rolfe,
who formulated a version of the returns problem which helped lead to the general problem pre- 
sented here. 

REFERENCES 

[1] Allen, S. G., "Redistribution of Total Stock over Several User Locations," Naval Research 
Logistics Quarterly, 5, 51-59 (1958). 

[2] Bessler, S., and A. F. Veinott Jr., "Optimal Policy for a Dynamic Multi-Echelon Inventory 
Problem," Naval Research Logistics Quarterly, 18, 355-389 (1966). 

[3] Clark, A. J., "An Informal Survey of Multi-Echelon Inventory Theory," Naval Research 
Logistics Quarterly, 19, 621-650 (1972). 

[4] Clark, A. J., and H. Scarf, "Optimal Policies for a Multi-Echelon Inventory Problem,"
Management Science, 6, 475-490 (1960).

[5] Das, Chandrasekhar, "Supply and Redistribution Rules for Two-Location Inventory Sys- 
tems: One-Period Analysis" Management Science, 21, 765-776 (1975). 

[6] Eaves, B. C., and R. Saigal, "Homotopies for Computing Fixed Points in Unbounded
Regions," Mathematical Programming, 3, 225-237 (1972).

[7] Fiacco, A. V., and G. P. McCormick, Nonlinear Programming: Sequential Unconstrained 
Minimization Techniques (Wiley, New York, 1968). 

[8] Gross, D., "Centralized Inventory Control in Multilocation Supply Systems," Chapter 3 in 
Multistage Inventory Models and Techniques, H. E. Scarf et al. eds. (Stanford University 
Press, Stanford, California, 1963). 

[9] Fukuda, Y., "Optimal Disposal Policies," Naval Research Logistics Quarterly, 8, 221-227
(1961).

[10] Krishnan, K. S., and V. R. K. Rao, "Inventory Control in N Warehouses," The Journal of
Industrial Engineering, 16, 212-215 (1965).

[11] Mifflin, Robert, "A Nonderivative Algorithm for Minimization of a Function of Several
Bounded Variables," paper presented at the joint O.R.S.A./T.I.M.S. Meeting (October,
1974).

[12] Simpson, K. E., Jr., "A Theory of Allocation of Stocks to Warehouses," Operations Research,
7, 797-805 (1959).

[13] Zangwill, W. I., Nonlinear Programming (Prentice-Hall, Englewood Cliffs, 1969).



OPTIMAL REJECT ALLOWANCE WITH CONSTANT MARGINAL PRODUCTION EFFICIENCY†



Avraham Beja 

The Leon Recanati Graduate School of Business Administration 

Tel-Aviv University 

Tel-Aviv, Israel 



ABSTRACT 



A job shop must fulfill an order for N good items. Production is conducted in 
"lots," and the number of good items in a lot can be accurately determined only 
after production of that lot is completed. If the number of good items falls short of 
the outstanding order, the shop must produce further lots, as necessary. 

Processes with "constant marginal production efficiency" are investigated. The
revealed structure allows efficient exact computation of optimal policy. The resulting
minimal cost exhibits a consistent (but not universal) pattern whereby higher
quality of production is advantageous even at proportionately higher marginal
cost.



I. INTRODUCTION AND SUMMARY 

Consider a job shop with an outstanding order for N items. Production is conducted in "runs" 
or "lots"* involving a set up cost and direct production cost. Some of the items in any lot may be 
defective, and the number of good items can be accurately determined only after the production run 
is completed. If the number of good items produced falls short of the outstanding order, the shop 
must initiate further runs as necessary. The operational problem, then, is to determine the optimal 
lot size that minimizes total costs incurred to fulfill the order. 

We tend to expect that under "reasonable" conditions the optimal lot size will be larger than 
the outstanding order to allow for the anticipated rejects, hence the traditional terminology of
"reject allowance" or "shrinkage allowance" associated with this problem.‡

Our analysis of the reject allowance problem includes questions of structure and of computation. 
The study of structure concerns the qualitative properties (monotonicity, unimodality) of the 
relevant functions such as the minimal cost and the optimal policy. Computationally, the focus
is on efficient methods for calculating the optimal decision in any given problem. The two aspects
are naturally interrelated; understanding the structure allows more efficient computation, and
computational algorithms provide insight on structure.

†This study was supported in part by the Israel Institute for Business Research, University Campus, Ramat
Aviv, Tel Aviv, Israel.

*The terms run and lot will be used interchangeably.

‡Some conditions for this to be indeed true will become clear through subsequent results.

The computational aspect of the reject allowance problem has attracted considerable interest 
(e.g. [2, 3, 4, 5, 7, 8, 9]). The analysis usually tended to concentrate on the search for
approximations to the cost function that would facilitate the computation. The suggested ("optimal")
production decision would then be the point that satisfies the usual first order conditions for the
surrogate (or approximate) cost function. Before a solution based on an analysis of this kind can be
safely employed, however, some quantitative measures of the degree of approximation must be established.
Otherwise, there is very little assurance that the suggested solution is in any meaningful sense
"close" to the optimal lot size or the minimal cost. Unfortunately, no such measures are given in
the literature. For example, Hillier [5] suggests an approximation for the first differences of the cost
function in a special production process. Even if the approximation is asymptotically valid, no
general assessment of the error is available for finite order sizes. Some writers (e.g. [2, 7, 8]) assume
that all costs beyond that of the first production run and one eventual additional set up cost may be
ignored. This is naturally restricted to cases with relatively high set-up cost, but even then it is not
at all clear that production runs beyond the first one should be either few in number or small in
size. Generally, the basic difficulty with all these approximations is that the analysis of the structure
of optimal policy is based not only on assumptions regarding the problem's parameters, but also on
assumptions regarding the optimal policy.

The present study is devoted to the analysis of a class of production processes characterized 
by a property called "constant marginal efficiency." This includes as a special case the widely con- 
sidered process with constant marginal production cost and binomially distributed defectives (e.g. 
[2,3,5,7,8,9]). 

The computational aspects of the problem are addressed first. Here, we seek efficient methods 
for an exact solution to the problem. The model is formally presented in Section II, and it is shown 
that, in principle, an exact solution can be computed recursively by enumeration of all "relevant" 
lot sizes. It is then suggested that if a certain structure of the model is established, the range of 
computation will be drastically reduced. To achieve this, a Markovian decision model formulation 
of the problem is presented and analyzed in Section III. The direct computational implications are 
drawn in Section IV. Roughly speaking, we establish that (i) there is an optimal policy whereby 
lot sizes are strictly increasing with the outstanding order, and (ii) if for some outstanding order 
a lot of n+1 is inferior to a lot of n, larger lots need not be considered. Given these results, the 
necessary volume of computation is not only "reasonable," but indeed trivial with modern com- 
puting capacity even for fairly large order sizes. In Section V, we investigate the sensitivity of total 
cost to the quality of production. The results show that under a fairly wide range of circumstances 
(though not always) higher quality is advantageous even at a proportionately higher marginal 
production cost. Some remarks on possible extensions and applications conclude the study in 
Section VI. 

II. A STRAIGHTFORWARD FORMULATION OF THE MODEL 

Production is defined by a stochastic process $\{X_j,\ j = 1, 2, \dots\}$ where $X_j = 1$ if the $j$-th item in the
lot is good and $X_j = 0$ if it is defective.

$G_k = \sum_{j=1}^{k} X_j$

denotes the number of good items in a lot of size $k$. For $j = 1, 2, \dots$ let $q_j = P[X_j = 1]$ be a given
technological property of the process, assumed independent of $X_1, X_2, \dots, X_{j-1}$. This property may be
equivalently represented by

$q(j, k) = P[G_k = j]$

or by

$Q(j, k) = P[G_k \ge j] = \sum_{i=j}^{k} q(i, k)$

for $k = 1, 2, \dots$ and $j = 0, 1, \dots, k$, where by elementary probability theory we have for $k \ge 1$

(2.1) $q(j, k) = q_k\, q(j-1, k-1) + (1-q_k)\, q(j, k-1)$ for $j = 0, \dots, k$, and $q(j, k) = 0$ otherwise.
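Recursion (2.1) translates directly into a table computation. The following is a minimal sketch, assuming a caller-supplied function `quality(j)` that returns $q_j$ and the natural starting value $q(0, 0) = 1$ (an empty lot contains no good items); the names `good_count_pmf` and `tail_prob` are illustrative only.

```python
def good_count_pmf(quality, k_max):
    """Return rows[k][j] = q(j, k) = P[G_k = j] for 0 <= j <= k <= k_max.

    quality(j) is assumed to return q_j, the probability that item j is good.
    """
    rows = [[1.0]]                      # q(0, 0) = 1: an empty lot has no good items
    for k in range(1, k_max + 1):
        qk = quality(k)
        prev = rows[-1] + [0.0]         # pad with q(k, k-1) = 0
        rows.append([(1 - qk) * prev[j] + (qk * prev[j - 1] if j > 0 else 0.0)
                     for j in range(k + 1)])          # recursion (2.1)
    return rows


def tail_prob(rows, j, k):
    """Q(j, k) = P[G_k >= j], for 0 <= j."""
    return sum(rows[k][j:])


rows = good_count_pmf(lambda j: 0.5, 3)
print(rows[3], tail_prob(rows, 2, 3))
```

With a constant $q_j = q$ the rows reduce to binomial probabilities, which gives a convenient sanity check.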

The production cost for a lot of size k is 

$Y(k) = C_0 + C_1 + \dots + C_k,$

where $C_0 \ge 0$ is a set-up cost and $C_j > 0$ is the expected direct (marginal) cost for producing the $j$-th
item in a run. $h_j = C_j/q_j$ is therefore the expected cost per good item produced at the $j$-th "stage"
in a production run, and $e_j = 1/h_j$ can therefore be considered the production "efficiency" at the $j$-th
stage. It is naturally assumed throughout that $Y(k)$ is unbounded.

We define three inter-related cost functions: 
$F(N)$ is the (minimal) total expected cost to fulfill an outstanding order of $N$ items when an
optimal policy is carried throughout.

$F(N, n)$ is the expected cost to fulfill an outstanding order of $N$ items, if the first lot is of size $n$
and all subsequent lots are optimal.

$f(N, n)$ is the expected cost to fulfill an outstanding order of $N$ items if a lot of size $n$ is produced
whenever the outstanding order is of size $N$ and an optimal lot size is produced whenever
the outstanding order is less than $N$.
The three functions satisfy the following relationships: 

(2.2) $f(N, n) = Y(n) + q(0, n)\, f(N, n) + \sum_{j=1}^{N-1} q(j, n)\, F(N-j)$

(2.3) $f(N, n) = \Big[\,Y(n) + \sum_{j=1}^{N-1} q(j, n)\, F(N-j)\Big] \Big/ \big[1 - q(0, n)\big]$

(2.4) $F(N, n) = Y(n) + \sum_{j=0}^{N-1} q(j, n)\, F(N-j)$

(2.5) $F(N) = \min_{n=1,2,\dots} F(N, n) = \min_{n=1,2,\dots} f(N, n)$

$F(N)$ clearly exists, and can in principle be computed in a finite number of steps as follows:
Start with $N = 1$. Compute $f(1, n) = Y(n)/\{1 - q(0, n)\}$ as in (2.3) for $n = 1, 2, \dots$ until, say, $m$
where

$Y(m) \ge \min_{n=1,\dots,m-1} f(1, n) = \hat{F}(N).$

Clearly $f(1, n) \ge \hat{F}(N)$ for $n \ge m$, and hence $F(N) = \hat{F}(N)$. Proceed similarly with $N = 2, 3, \dots$, at
each stage using in (2.3) the values $F(M)$, $M = 1, \dots, N-1$, computed previously.
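The enumeration just described can be sketched as follows. This is an illustrative implementation, not the author's program: it assumes constant marginal efficiency so that $C_j = h\,q_j$, a caller-supplied `quality(j)` returning $q_j$ in $(0, 1]$, and $F(0) = 0$; the function name `enumerate_optimal_lots` is hypothetical. The loop stops as soon as $Y(n)$ reaches the best value of $f(N, \cdot)$ found so far, which is the stopping rule stated in the text.

```python
def enumerate_optimal_lots(max_order, setup, h, quality):
    """Exact F(N) and an optimal first lot for N = 1..max_order by plain enumeration.

    Assumes constant marginal efficiency: item j costs C_j = h * quality(j),
    with quality(j) = q_j in (0, 1], so that Y(n) grows without bound.
    """
    F = [0.0] * (max_order + 1)         # F[0] = 0: nothing left to produce
    best_lot = [0] * (max_order + 1)
    for N in range(1, max_order + 1):
        best, arg = float("inf"), 0
        pmf = [1.0]                     # q(., 0)
        Y = setup                       # Y(0) = C_0
        n = 0
        while True:
            n += 1
            qn = quality(n)
            Y += h * qn
            if Y >= best:               # f(N, k) >= Y(k) >= Y(n) for every k >= n
                break
            prev = pmf + [0.0]
            pmf = [(1 - qn) * prev[j] + (qn * prev[j - 1] if j > 0 else 0.0)
                   for j in range(n + 1)]           # recursion (2.1)
            if pmf[0] < 1.0:
                tail = sum(pmf[j] * F[N - j] for j in range(1, min(n, N - 1) + 1))
                f = (Y + tail) / (1.0 - pmf[0])     # equation (2.3)
                if f < best:
                    best, arg = f, n
        F[N], best_lot[N] = best, arg
    return F, best_lot


costs, lots = enumerate_optimal_lots(10, setup=5.0, h=1.0, quality=lambda j: 0.9)
print(costs[1:], lots[1:])
```

The work per order size grows with the number of lot sizes examined, which is the practical difficulty taken up next.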

The practical difficulty with this procedure originates from the wide range of values of $n$ for
which $f(N, n)$ must be evaluated at each $N$. The present study helps to reduce this range for
processes with constant marginal production efficiency, i.e. $h_j = h$ for all $j$. This is achieved by
establishing two important properties of these processes:

(i) If a lot of $n$ is optimal for an outstanding order of $N$, then for an order of $N+1$ there is an
optimal lot $r > n$ (monotonicity of the optimal policy).

(ii) If $n$ is optimal for $N$ and $f(N, k+1) > f(N, k)$ then $n \le k$.

These properties are proved indirectly by an alternative Markovian decision model formulation
of the same process.*

III. AN ALTERNATIVE MARKOVIAN FORMULATION 

Define a Markov decision model as follows: for $k = 0, 1, 2, \dots$ and $N = 0, 1, 2, \dots$, the process is
said to be in state $(k, N)$ if the outstanding order is $N$, set up has been achieved and $k$ items have
been manufactured but not inspected for defectives. The process operates in discrete time, and
at each period the following actions are available: (i) Continue production at an immediate cost
of $C_{k+1}$, with a transition to state $(k+1, N)$ with probability 1. (ii) Stop production and inspect
the $k$ items already produced for the number $j$ of good items; if $j < N$ a new production run must
be initiated, hence with probability $q(j, k)$, $j = 0, \dots, N-1$, there is a transition to $(0, N-j)$
at a cost $C_0$, and with probability $Q(N, k)$ the process terminates.†

An optimal policy is a decision rule that minimizes the expected sum of all costs until the
process terminates. Let $v(k, N)$ be the minimal expected total cost from the time the process is
in state $(k, N)$ until termination. $v(k, N)$ clearly exists, and satisfies

(3.1) $v(k, N) = \min \{v^I(k, N),\ C_{k+1} + v(k+1, N)\}$

where

(3.2) $v^I(k, N) = \sum_{j=0}^{N-1} q(j, k) \{C_0 + v(0, N-j)\}$

$v^I(k, N)$ is the conditional minimal expected cost from state $(k, N)$ until termination, given that
the process is to be inspected in the present period.

LEMMA 3.1: $v^I(k+1, N+1) = (1-q_{k+1})\, v^I(k, N+1) + q_{k+1}\, v^I(k, N)$

PROOF: By conditioning the expected cost on $X_{k+1}$.

Let

$H(k, N) = v^I(k, N) - v^I(k+1, N)$; then for $N = 1, 2, \dots$

$H(k, N) = v^I(k, N) - (1-q_{k+1})\, v^I(k, N) - q_{k+1}\, v^I(k, N-1)$



*Both properties are very intuitive. Should the reader feel that this calls for a direct proof, he may find it
instructive to try proving any of them directly for the simplest case where $C_j = C$ and $q_j = q$ for all $j$.

†To ensure termination under optimal policy even if $C_0 = 0$, inspection is not allowed if $N > 0$ and $k = 0$.

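(3.3) $H(k, N) = q_{k+1} \{v^I(k, N) - v^I(k, N-1)\}$

$H(k+1, N) = (1-q_{k+1})\, v^I(k, N) + q_{k+1}\, v^I(k, N-1) - (1-q_{k+2})\, v^I(k+1, N) - q_{k+2}\, v^I(k+1, N-1)$

$H(k+1, N) = (1-q_{k+2})\, H(k, N) + q_{k+2}\, H(k, N-1) - (q_{k+1} - q_{k+2}) \{v^I(k, N) - v^I(k, N-1)\}$

and substituting (3.3)

$H(k+1, N) = \Big\{1 - q_{k+2} - \frac{q_{k+1} - q_{k+2}}{q_{k+1}}\Big\} H(k, N) + q_{k+2}\, H(k, N-1)$

(3.4) $H(k+1, N) = \frac{q_{k+2}}{q_{k+1}} \{(1-q_{k+1})\, H(k, N) + q_{k+1}\, H(k, N-1)\}$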

Let $M$ denote the set of all states in which it is optimal to manufacture another item and $I$ the set
of all states in which it is optimal to stop and inspect, i.e.

(3.5) $(k, N) \in M$ if $v(k, N) = C_{k+1} + v(k+1, N)$

(3.6) $(k, N) \in I$ if $v(k, N) = v^I(k, N)$

then for $(k, N) \in I$

$v^I(k, N) \le C_{k+1} + v(k+1, N) \le C_{k+1} + v^I(k+1, N)$

(3.7) $H(k, N) \le C_{k+1}$ for all $(k, N) \in I$

and if

$v^I(k, N) \ge C_{k+1} + v^I(k+1, N)$

then

$v^I(k, N) \ge C_{k+1} + v(k+1, N)$ and $(k, N) \in M$,

thus

(3.8) $H(k, N) \ge C_{k+1}$ implies $(k, N) \in M$

We shall also use the following important property: 

LEMMA 3.2: $v(0, N+1) - v(0, N) \ge h$

PROOF: Let $r = \min \{k : (k, N+1) \in I\}$ (clearly $r$ exists and $r > 0$). Then

$v(0, N+1) = C_1 + \dots + C_r + v^I(r, N+1)$

(3.9) $v(0, N+1) = C_1 + \dots + C_r + (1-q_r)\, v^I(r-1, N+1) + q_r\, v^I(r-1, N)$

$v(0, N+1) \ge C_1 + \dots + C_{r-1} + C_r + (1-q_r)\, v(r-1, N+1) + q_r\, v(r-1, N)$

but by recursive application of (3.1)

$v(0, N) \le C_1 + \dots + C_{r-1} + v(r-1, N)$

and by assumption $v(0, N+1) = C_1 + \dots + C_{r-1} + v(r-1, N+1)$. Hence from (3.9)

$v(0, N+1) \ge C_r + (1-q_r)\, v(0, N+1) + q_r\, v(0, N)$

$v(0, N+1) - v(0, N) \ge C_r/q_r = h$



COROLLARY 3.3: For $N = 1, 2, \dots$, $H(0, N) \ge C_1$.
PROOF: By Lemma 3.2 and (3.3).

We now present the basic structure of the reject allowance model.

THEOREM 3.4: There is a strictly increasing sequence $\{n^*(N)\}$ of non-negative integers,
such that in state $(k, N)$ it is optimal to continue production if $k < n^*(N)$ and to stop and inspect
if $k \ge n^*(N)$.

PROOF: We show by induction on $N$ that

(i) for $k < n^*(N)$, $(k, N) \in M$ and $H(k, N) \ge C_{k+1}$
(ii) for $k \ge n^*(N)$, $(k, N) \in I$ and $H(k, N) < C_{k+1}$
(iii) $n^*(N+1) > n^*(N)$.

For $N = 0$, (i) and (ii) are trivially true with $n^*(0) = 0$, because $v^I(k, 0) = 0$ and $H(k, 0) = 0$
for all $k$. For $N = 1$, let

$m = \min \{k : H(k, 1) < C_{k+1}\}$

$m$ exists because if $H(k, 1) \ge C_{k+1}$ for all $k$ then $(k, 1) \in M$ for all $k$ and $v(0, 1) = C_1 + \dots + C_n + v(n, 1)$
for all $n$, which is impossible because $v(0, 1)$ clearly exists and $Y(n)$ is unbounded. By Corollary 3.3
$m > 0$. For $k \ge m$, $H(k, 1) < C_{k+1}$ by induction, because if $H(j, 1) < C_{j+1}$ then by (3.4)

$H(j+1, 1) \le \frac{q_{j+2}}{q_{j+1}} (1-q_{j+1})\, C_{j+1} < \frac{q_{j+2}}{q_{j+1}}\, C_{j+1} = C_{j+2}.$

Let $n^*(1) = m$.

Assume inductively that (i) and (ii) are true for $N$. By Corollary 3.3 $H(0, N+1) \ge C_1$, and we
note that if $H(k-1, N) \ge C_k$ and $H(k-1, N+1) \ge C_k$ then by (3.4)

$H(k, N+1) \ge \frac{q_{k+1}}{q_k} \{(1-q_k)\, C_k + q_k\, C_k\} = \frac{q_{k+1}}{q_k}\, C_k = C_{k+1}$

hence, by induction on $k$, $H(k, N+1) \ge C_{k+1}$ for $k = 1, \dots, n^*(N)$. Let

$r = \min \{k : H(k, N+1) < C_{k+1}\}$

Existence of $v(0, N+1)$ implies existence of $r$, and clearly $r > n^*(N)$. For $k > r$, $H(k-1, N) < C_k$ and
hence, again by induction on $k$, $H(k, N+1) < C_{k+1}$ because if $H(k-1, N+1) < C_k$ then

$H(k, N+1) < \frac{q_{k+1}}{q_k} \{(1-q_k)\, C_k + q_k\, C_k\} = \frac{q_{k+1}}{q_k}\, C_k = C_{k+1}.$

Let $n^*(N+1) = r$, and the theorem is proved by induction for all $N$.

IV. STRUCTURE AND COMPUTATION 

The formulation of Section III is convenient for understanding the structure of the reject
allowance model, but not necessarily for computing the optimal policy. It therefore remains to be
seen how the results of Theorem 3.4 bear on the direct computation of optimal policy by $f(N, n)$.
For $N = 1, 2, \dots$, $F(N) = C_0 + v(0, N)$, and by (2.4) and (3.2)

(4.1) $v^I(k, N) = F(N, k) - Y(k)$

and if $n^*(N) = m$

$F(N) = C_0 + v(0, N) = C_0 + C_1 + \dots + C_m + v^I(m, N) = F(N, m)$

The (strict) monotonicity of $n^*(N)$ thus allows the evaluations of $f(N+1, n)$ to start with
$n = n^*(N) + 1$.

Now consider

$H(k, N) = \{F(N, k) - Y(k)\} - \{F(N, k+1) - Y(k+1)\} = F(N, k) - F(N, k+1) + C_{k+1}$

so that $H(k, N) \ge C_{k+1}$ is equivalent to $F(N, k) \ge F(N, k+1)$ and $H(k, N) < C_{k+1}$ is equivalent
to $F(N, k) < F(N, k+1)$.

Theorem 3.4 therefore establishes that $F(N, n)$ is "quasi-convex," since $F(N, n+1) \le F(N, n)$ for
$n < n^*(N)$ and $F(N, n+1) > F(N, n)$ for $n \ge n^*(N)$. Computationally, however, the interest is in
$f(N, n)$ rather than $F(N, n)$, and one further step is necessary.

THEOREM 4.1: $f(N, n+1) > f(N, n)$ implies $F(N, n+1) > F(N, n)$.

PROOF: By (2.2) and (2.4)

$f(N, n+1) - f(N, n) = F(N, n+1) + q(0, n+1)\, f(N, n+1) - q(0, n+1)\, F(N) - F(N, n) - q(0, n)\, f(N, n) + q(0, n)\, F(N)$

$f(N, n+1) - f(N, n) = F(N, n+1) - F(N, n) + q(0, n+1) \{f(N, n+1) - f(N, n)\} - \{q(0, n) - q(0, n+1)\}\, f(N, n) + \{q(0, n) - q(0, n+1)\}\, F(N)$

(4.3) $\{q(0, n) - q(0, n+1)\} \{f(N, n) - F(N)\} + \{1 - q(0, n+1)\} \{f(N, n+1) - f(N, n)\} = F(N, n+1) - F(N, n)$

The first term on the left hand side of (4.3) is clearly non-negative, and if the second term is positive
the right hand side must be positive. Q.E.D.

Note also that at $n = n^*(N)$ the right hand side of (4.3) is positive, the first term on the left
hand side vanishes, and hence $f(N, n+1) > f(N, n)$.

Theorems 3.4 and 4.1 thus establish a very effective upper bound for the computations of
$f(N, n)$, which always stop at $n^*(N) + 1$, where $f(N, n) - f(N, n-1)$ first becomes positive. If the
values of $q_j$ are not extremely small there may be no need for more than just a few evaluations of
$f(N, n)$ for each $N$. Experimentation suggests that for $N$ as high as 100 the computations need usually
take no more than a few seconds with a large scale computer.
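The two computational shortcuts can be combined in a short routine. The sketch below is one possible reading, not the author's code: for each $N$ it starts the search at $n^*(N-1) + 1$ and stops at the first $n$ for which $f(N, n+1) > f(N, n)$. It assumes constant marginal efficiency ($C_j = h\,q_j$), $q_j$ in $(0, 1]$ supplied by a hypothetical `quality(j)` function, and $n^*(0) = 0$; the name `reject_allowance` is illustrative.

```python
def reject_allowance(max_order, setup, h, quality):
    """Costs F(N) and lot sizes n*(N), N = 0..max_order, using the structural bounds.

    Assumes constant marginal efficiency: item j costs C_j = h * quality(j),
    where quality(j) = q_j in (0, 1] is the probability that item j is good.
    """
    F = [0.0] * (max_order + 1)
    lot = [0] * (max_order + 1)         # lot[0] = n*(0) = 0
    pmf = [[1.0]]                       # pmf[k][j] = q(j, k), extended on demand
    Y = [setup]                         # Y[k] = C_0 + C_1 + ... + C_k

    def extend(k):
        while len(pmf) <= k:
            m = len(pmf)
            qm = quality(m)
            prev = pmf[-1] + [0.0]      # pad with q(m, m-1) = 0
            pmf.append([(1 - qm) * prev[j] + (qm * prev[j - 1] if j else 0.0)
                        for j in range(m + 1)])       # recursion (2.1)
            Y.append(Y[-1] + h * qm)

    def f(N, n):
        extend(n)
        p0 = pmf[n][0]
        if p0 >= 1.0:
            return float("inf")
        tail = sum(pmf[n][j] * F[N - j] for j in range(1, min(n, N - 1) + 1))
        return (Y[n] + tail) / (1.0 - p0)             # equation (2.3)

    for N in range(1, max_order + 1):
        n = lot[N - 1] + 1              # monotonicity: n*(N) > n*(N-1)
        current = f(N, n)
        while True:
            following = f(N, n + 1)
            if following > current:     # first increase of f(N, .) marks the stop
                break
            n, current = n + 1, following
        F[N], lot[N] = current, n
    return F, lot


costs, lots = reject_allowance(100, setup=5.0, h=1.0, quality=lambda j: 0.9)
print(lots[1:11])
```

Caching the rows of $q(j, k)$ across order sizes is a design choice made here for brevity; the number of evaluations of $f(N, n)$ per order size is what the theorems bound.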

V. THE EFFECTS OF QUALITY ON COST 

Returning to the general structure of the reject allowance problem, the next question to be
considered concerns the relationship between $F(N)$ and the system's parameters. Processes with
constant marginal efficiency can be identified by three elements: the set up cost $C_0$, the marginal
efficiency $e$ or, equivalently, its reciprocal value $h$, and the sequence $(q) = q_1, q_2, \dots$ which
represents the "quality" of production, with a lower percentage of defectives for higher $(q)$. Given $(q)$
and $C_0$, $F(N)$ clearly increases with $h$, and, similarly, given $(q)$ and $h$, $F(N)$ increases with $C_0$. The
question, then, is: given $C_0$ and $h$, how does $F(N)$ depend on $(q)$? If $C_0 = 0$, the answer is immediate.

THEOREM 5.1: If $C_0 = 0$ then $F(N) = Nh$ (regardless of $(q)$).

PROOF: $F(N) \le F(M) + F(N-M)$ for $M < N$, hence $F(N) \le N F(1)$. But $F(1) \le f(1, 1)$, and
when $C_0 = 0$ then by (2.3) $f(1, 1) = C_1/q_1 = h$, hence $F(N) \le Nh$. By recursive application of Lemma
3.2 $F(N) \ge Nh$, and hence $F(N) = Nh$. Q.E.D.

It is easy to see (e.g. by a numerical example such as the one given at the end of this section)
that, generally, $F(N)$ need not be independent of $(q)$. In this section we investigate the effects of
quality on total cost by making various comparisons at different levels of generalization between 
processes with the same (arbitrary) Co and h. The comparisons exhibit a consistent pattern whereby 
lower quality is more costly. As we shall see, however, this statement must be accepted only in the 
exact sense of the theorems proved below; a careless "universal" interpretation is false. 

One way to compare the costs associated with two production processes is by formulating a 
decision problem where, before each lot is produced, a choice is available as to which of the two 
processes to use for that lot. If an optimal policy always uses the same process, then certainly this 
process is (costwise) at least as good as the other. This approach is not only methodologically 
convenient, but also firmly rooted in the motivation underlying the analysis. 

We shall make repeated use of Howard's policy iteration principle.* For our purposes, this well
known principle may be conveniently summarized in rough words as follows: policy I is preferred
(strictly preferred) to policy II if, and only if, using policy I once and policy II afterwards is
preferred (strictly preferred) to policy II.

THEOREM 5.2: Let production process I be specified by $C_0$, $h$ and $(q)$, and process II by
$C_0$, $h$ and $(p)$, where $p_i = q_i$ for $i \ne n$ and $p_n \ge \sup \{q_i,\ i = n, n+1, \dots\}$. Then $F_{II}(N) \le F_I(N)$.

PROOF: It is convenient to introduce a hypothetical third process, process 0, specified by
$C_0$, $h$ and $(q^0)$ where $q_n^0 = 0$ and $q_i^0 = q_i$ for $i \ne n$.† Consider a decision problem involving a choice
among the three processes. $F_{II}(0) = F_I(0) = F_0(0)$.
Assume inductively that for $M = 0, 1, \dots, N-1$ process II is optimal and

$F_{II}(M) \le F_I(M)$, $F_{II}(M) \le F_0(M)$.

For $N$, let a non-stationary policy A use process I for the first lot and process II for all subsequent
lots, at a total cost of $F^A(N, k)$ if the size of the first lot is $k$. Similarly, policy B uses process 0
once and process II afterwards. Let $r$ be optimal for A and $m$ for B, i.e.

(5.1) $F^A(N, r) = \min_{k=1,2,\dots} F^A(N, k)$

(5.2) $F^B(N, m) = \min_{k=1,2,\dots} F^B(N, k)$

We shall show that $F_{II}(N) \le F^A(N, r)$ and $F_{II}(N) \le F^B(N, m)$, so that by Howard's principle process
II is optimal for $N$.

*Cf. Howard [6] or Blackwell [1].

†This is clearly just a convenient way of assigning the indices in a process which is essentially defined by $C_0$,
$h$, $(w)$, where $w_i = q_i^0$ for $i \le n-1$ and $w_i = q_{i+1}^0$ for $i \ge n$, with $w_i > 0$ for all $i$.

$F^B(N, m) = Y_0(m) + \sum_{i=0}^{N-1} q_0(i, m)\, F_{II}(N-i)$

so that if $m < n$

$F^B(N, m) = F_{II}(N, m) \ge F_{II}(N),$

and if $m = n$

$F^B(N, m) = F_{II}(N, n-1) \ge F_{II}(N).$

If $m \ge n+1$, then by (2.1)

$F^B(N, m) = Y_0(m-1) + q_m h + \sum_{i=0}^{N-1} \{(1-q_m)\, q_0(i, m-1) + q_m\, q_0(i-1, m-1)\}\, F_{II}(N-i)$

(5.3) $F^B(N, m) = F^B(N, m-1) + q_m D(m-1)$

where

(5.4) $D(m-1) = h - \sum_{i=0}^{N-1} \{q_0(i, m-1) - q_0(i-1, m-1)\}\, F_{II}(N-i)$

Equation (5.2) implies that $D(m-1) \le 0$. Consider $F_{II}(N, m-1)$ as defined in (2.4)

$F_{II}(N, m-1) = Y_{II}(m-1) + \sum_{i=0}^{N-1} q_{II}(i, m-1)\, F_{II}(N-i)$

An immediate extension of (2.1), again by elementary probability theory, establishes that for $k \ge n$

$q_{II}(i, k) = (1-p_n)\, q_0(i, k) + p_n\, q_0(i-1, k)$

and thus

$F_{II}(N, m-1) = Y_0(m-1) + p_n h + \sum_{i=0}^{N-1} \{(1-p_n)\, q_0(i, m-1) + p_n\, q_0(i-1, m-1)\}\, F_{II}(N-i)$

$F_{II}(N, m-1) = F^B(N, m-1) + p_n D(m-1)$

and by (5.3)

$F_{II}(N, m-1) = F^B(N, m) + (p_n - q_m)\, D(m-1)$

$D(m-1) \le 0$ and $p_n \ge q_m$ for $m \ge n+1$ imply that

(5.5) $F^B(N, m) \ge F_{II}(N, m-1) \ge F_{II}(N).$



Now consider

$F^A(N, r) = Y_I(r) + \sum_{i=0}^{N-1} q_I(i, r)\, F_{II}(N-i)$

and if $r \le n-1$

$F^A(N, r) = F_{II}(N, r) \ge F_{II}(N).$

For $r \ge n$,

$F^A(N, r) = Y_0(r) + q_n h + \sum_{i=0}^{N-1} \{(1-q_n)\, q_0(i, r) + q_n\, q_0(i-1, r)\}\, F_{II}(N-i)$

(5.6) $F^A(N, r) = F^B(N, r) + q_n D(r)$

where $D(r)$ is defined as in (5.4). Also

$F_{II}(N, r) = Y_0(r) + p_n h + \sum_{i=0}^{N-1} \{(1-p_n)\, q_0(i, r) + p_n\, q_0(i-1, r)\}\, F_{II}(N-i)$

$F_{II}(N, r) = F^B(N, r) + p_n D(r)$

and by (5.6)

(5.7) $F_{II}(N, r) = F^A(N, r) + (p_n - q_n)\, D(r)$

If $D(r) \le 0$ then (5.7) and $p_n \ge q_n$ imply $F^A(N, r) \ge F_{II}(N, r) \ge F_{II}(N)$, and if $D(r) \ge 0$ then (5.6),
(5.2) and (5.5) imply $F^A(N, r) \ge F^B(N, r) \ge F^B(N, m) \ge F_{II}(N)$. We have thus proved that

$F_{II}(N) \le F^A(N, r)$

as well as $F_{II}(N) \le F^B(N, m)$, process II is optimal for $N$, and in particular

$F_{II}(N) \le F_I(N),$

so that the proof is valid by induction for all $N$.

It has already been established earlier by Lemma 3.2 that an option to obtain a good item at
cost $h$ separately from the production process (i.e. between two production runs) always pays, since
$F(N) \ge F(N-1) + h$. Theorem 5.2 allows an extension of this property, which may perhaps be
best formulated immediately in the operational context.

COROLLARY 5.3: An option to obtain a good item at cost $h$ at any stage of the production
process always pays.

PROOF: Let the basic quality sequence of the process be $(w)$. If the option involves perfect
production of, say, the $n$-th item, let $(q) = (w)$ and let $(p)$ be defined by $p_n = 1$, $p_i = w_i$ for $i \ne n$,
and by Theorem 5.2 the option pays.

If the option allows the good item to be obtained between the $(n-1)$-th and the $n$-th item in a run,
let $(p)$ and $(q)$ be defined by $p_i = q_i = w_i$ for $i \le n-1$, $p_n = 1$, $q_n = 0$ and $p_i = q_i = w_{i-1}$ for $i \ge n+1$.
Theorem 5.2 again holds. Q.E.D.

The next theorem generalizes the "advantage of quality" presented in Theorem 5.2. 

THEOREM 5.4: Let process I be specified by $C_0$, $h$, and $(q)$, and process II by $C_0$, $h$, and $(p)$.
If $q_i \ge q_{i+1}$ and $p_i \ge q_i$ for $i = 1, 2, \dots$ then $F_{II}(N) \le F_I(N)$.

PROOF: Define a sequence of processes, indexed by $n = 2, 3, \dots$, with $C_0$, $h$ and $(w^n)$ where
$w_i^n = p_i$ for $i = 1, \dots, n-1$ and $w_i^n = q_i$ for $i = n, n+1, \dots$. Then by Theorem 5.2 $F_2(N) \le F_I(N)$
and $F_{n+1}(N) \le F_n(N)$ for $n = 2, 3, \dots$. Since the range of lot sizes that are relevant for $N$ is
bounded, $(p)$ is operationally equivalent to $(w^n)$ for $n$ sufficiently large and $F_{II}(N) \le F_I(N)$. Q.E.D.

Processes with non-increasing (q) as in Theorem 5.4 are analogous to the notion of "increasing 
failure rate" in reliability theory in the sense that negative effects of time and wear dominate 
production. When effects of "running in" and "learning" are dominant, non-decreasing quality 
sequences (q) are of interest. Here the "advantage of quality" takes a slightly different form.

THEOREM 5.5: Let process I be specified by $C_0$, $h$, and $(q)$, and process II by $C_0$, $h$, and
$(p)$. If $q_i \le q_{i+1}$ and $p_i = q_{i+n}$ for all $i$ and some (natural) $n$ then $F_{II}(N) \le F_I(N)$.

PROOF: It suffices to consider $n = 1$, because for $n = 2, 3, \dots$ the theorem then follows
immediately by induction.

Consider the decision problem involving a choice between the two processes. $F_{II}(0) = F_I(0)$,
and we assume inductively that for $M = 0, \dots, N-1$ process II is optimal and $F_{II}(M) \le F_I(M)$.
Suppose that process I is optimal for $N$, and that the corresponding optimal lot size is $r$. Let policy
A use process II for the first lot and an optimal policy for all subsequent lots, with a total cost of
$F^A(N, k)$ if the size of the first lot is $k$. By assumption

$F(N) = Y_I(r) + \sum_{i=0}^{N-1} q_I(i, r)\, F(N-i)$

$F(N) = Y_{II}(r-1) + q_1 h + \sum_{i=0}^{N-1} \{(1-q_1)\, q_{II}(i, r-1) + q_1\, q_{II}(i-1, r-1)\}\, F(N-i)$

(5.8) $F(N) = F^A(N, r-1) + q_1 D_{II}(r-1)$

where

$D_{II}(r-1) = h - \sum_{i=0}^{N-1} \{q_{II}(i, r-1) - q_{II}(i-1, r-1)\}\, F(N-i)$

Also

$F^A(N, r) = Y_{II}(r) + \sum_{i=0}^{N-1} q_{II}(i, r)\, F(N-i)$

$F^A(N, r) = Y_{II}(r-1) + p_r h + \sum_{i=0}^{N-1} \{(1-p_r)\, q_{II}(i, r-1) + p_r\, q_{II}(i-1, r-1)\}\, F(N-i)$

(5.9) $F^A(N, r) = F^A(N, r-1) + p_r D_{II}(r-1)$

and by (5.8)

(5.10) $F^A(N, r) = F(N) + (p_r - q_1)\, D_{II}(r-1)$

If $D_{II}(r-1) \ge 0$ then by (5.8) $F^A(N, r-1) \le F(N)$, and if $D_{II}(r-1) \le 0$ then (since $p_r = q_{r+1} \ge q_1$) by (5.10)
$F^A(N, r) \le F(N)$. Hence by Howard's principle process II is optimal for $N$, $F_{II}(N) \le F_I(N)$ and the
theorem is proved by induction on $N$ and on $n$.

It is tempting to try to extend the fairly general advantage of quality of Theorem 5.4 to arbi- 
trary quality sequences (q), or, in view of Theorem 5.5, at least to non-decreasing sequences. This 
extension is false, however, as demonstrated by the following numerical example. 

AN EXAMPLE WITH DISADVANTAGE TO QUALITY: 
Let $C_0 = h = 1$, $(q) = 0.1, 1, 1, \dots$, $(p) = 0.2, 1, 1, \dots$. The first process involves a lower quality
at the first stage of production, but still $F_I(1) = 2.1$ whereas for the second "higher quality"
process $F_{II}(1) = 2.2 > F_I(1)$.
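The two values in this example can be reproduced from (2.3) with $N = 1$, where $f(1, n) = Y(n)/\{1 - q(0, n)\}$ and $q(0, n) = \prod_{j \le n} (1 - q_j)$. The snippet below is a small check under the stated parameters; the helper name `first_order_costs` is hypothetical, and $C_j = h\,q_j$ is used as implied by constant marginal efficiency.

```python
def first_order_costs(setup, h, quality, n_max):
    """Return [f(1, 1), ..., f(1, n_max)] with C_j = h * quality(j)."""
    costs = []
    y, p_none = setup, 1.0              # Y(0) = C_0 and q(0, 0) = 1
    for n in range(1, n_max + 1):
        qn = quality(n)
        y += h * qn                     # Y(n) = Y(n-1) + C_n
        p_none *= (1.0 - qn)            # q(0, n): no good item among the first n
        costs.append(y / (1.0 - p_none) if p_none < 1.0 else float("inf"))
    return costs


low = first_order_costs(1.0, 1.0, lambda j: 0.1 if j == 1 else 1.0, 4)
high = first_order_costs(1.0, 1.0, lambda j: 0.2 if j == 1 else 1.0, 4)
print(min(low), min(high))
```

In both processes a lot of one item is very costly (about 11 and 6 respectively), so the optimal lot is two items; the second item is certainly good, and the only difference between the processes is the larger direct cost $C_1 = h\,p_1$ paid for the first item of the "higher quality" process. Lots beyond $n = 4$ need not be examined because $Y(n)$ already exceeds both minima.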

VI. EXTENSIONS, APPLICATIONS AND LIMITATIONS 

The formulation of (2.2) through (2.5) does not include an explicit "salvage value" for defective
items or for good ones produced in excess of the outstanding order. Nevertheless, if items of both
kinds have the same salvage value, the process still falls within the scope of our model by a
straightforward transformation. Let the index $s$ refer to a process with salvage value $S$, then

(6.1) $F_s(N, n) = Y_s(n) + \sum_{i=0}^{N-1} q(i, n) \{F_s(N-i) - (n-i) S\} - Q(N, n) (n-N) S$

define

$F_T(N, n) = F_s(N, n) - NS$

$Y_T(n) = Y_s(n) - nS$ (or equivalently $C_j^T = C_j^s - S$)

and then (6.1) becomes

$F_T(N, n) + NS = Y_T(n) + nS + \sum_{i=0}^{N-1} q(i, n) \{F_T(N-i) + (N-i) S - (n-i) S\} - Q(N, n) (n-N) S$

$F_T(N, n) + NS = Y_T(n) + nS + \sum_{i=0}^{N-1} q(i, n)\, F_T(N-i) + \{1 - Q(N, n)\} (N-n) S - Q(N, n) (n-N) S$

(6.2) $F_T(N, n) = Y_T(n) + \sum_{i=0}^{N-1} q(i, n)\, F_T(N-i)$

and (6.2), which is equivalent to (2.4), solves the problem because minimization of $F_T(N, n)$ is
clearly equivalent to minimization of $F_s(N, n)$. $Y_T(n)$ is the "excess production cost" of a lot of $n$
items (beyond its salvage value) and it should be emphasized that the transformation is useful
only if "constant marginal efficiency" applies to the excess marginal cost $C_j^T$. $F_T(N)$ is the excess
cost of an order for $N$ good items, beyond their salvage value.

In Wadsworth and Chang's model [9] the first set up cost need not equal the cost of sub- 
sequent set ups. This variation does not affect our analysis in any way, because the difference in 
cost between the first and subsequent set ups may simply be charged to F{N), with a constant 
set up cost considered throughout. 

If the technological or economic parameters change between runs, let consecutive runs be
indexed by $t = 1, 2, \dots$ so that (3.2) becomes

Monotonicity of $n_t^*(N)$ and quasi-convexity of $F_t(N, n)$ still hold, because Theorem 3.4 is
true for all $t$, as can be verified by careful inspection of the proofs involved. This interesting
structural property has little, if any, computational value, however, because $F_{t+1}(M)$ must be evaluated
for $M = 0, \dots, N-1$ before $f_t(N, n)$ can be computed.

The unavoidable limitation of the model is, of course, the inherent restriction to processes
with constant marginal efficiency. This domain naturally includes the dominantly popular $q_j = q$
and $C_j = c$ for all $j$. If $q_j$ varies during production, the model still includes all cases where direct cost
is proportional to the number of good items produced, rather than the total number of items. It
can serve as a fairly good approximation when quality incentive payments are a dominant part of
direct production cost, or when a machine's power consumption, although varying with time during
production, strongly affects both quality and cost, etc. Within the context of constant marginal
efficiency, effects of "running-in" and "learning" (increasing $q_j$) or "fatigue" and "wear" (decreasing
$q_j$) are certainly allowed in unlimited variations.

An interesting special application concerns contracting. Suppose N good items are needed, and
an agreement with a subcontractor can be secured whereby an order for the delivery of n items is
placed under the understanding that only good items are paid for, at constant price h (independent
of n). Then if $q(j, n)$ is the probability that out of n items delivered j items are good and $C_0$ is the
cost of placing the order, the model applies.

The study of more general processes, where changes in marginal cost reflect more than changes
in $q_j$, is certainly of interest. It is doubtful, however, that the structure of such processes is nearly as
powerful for the computation of an exact solution as in processes with constant marginal efficiency.
In particular, it can be readily verified that $F(N, n)$ need not in general be quasi-convex, and a local
minimum does not guarantee a global minimum. For processes with decreasing marginal efficiency
we do not even expect that $n^*(N) \ge n^*(N-1)$, nor indeed that $n^*(N) \ge N$.

With constant marginal efficiency $n^*(N) \ge n^*(N-1)$ ensures that $n^*(N) \ge N$, so that at least
for that class the term "reject allowance" is justified.

REFERENCES 

[1] Blackwell, D., "Discrete Dynamic Programming," Annals of Mathematical Statistics, 33, 719-726 (1962).
[2] Bowman, E. H., and R. N. Fetter, Analysis for Production Management, Revised Edition (Richard D. Irwin, Inc., Homewood, Illinois, 1960), 324-330.
[3] Goode, H. P., and S. Saltzman, "Computing Optimum Shrinkage Allowances for Small Order Sizes," The Journal of Industrial Engineering, 57-61 (January-February, 1961).
[4] Gregory, W. R., and A. Beged-Dov, "On the Determination of Optimal Shrinkage Allowance in a Job Shop," The Journal of Industrial Engineering (April, 1967).
[5] Hillier, F. S., "Reject Allowances for Job Lot Orders," The Journal of Industrial Engineering, 311-316 (November-December, 1963).
[6] Howard, R. A., Dynamic Programming and Markov Processes (John Wiley and Sons, New York, 1960).
[7] Levitan, R. E., "The Optimum Reject Allowance Problem," Management Science, 6, 172-186 (1960).
[8] Llewellyn, R. W., "Order Sizes for Job Lot Manufacturing," The Journal of Industrial Engineering, 176-180 (May-June, 1959).
[9] Wadsworth, H. M., and S. H. Chang, "The Reject Allowance Problem: An Analysis and Application to Job Lot Production," The Journal of Industrial Engineering, 127-132 (May-June, 1964).



A CHANCE-CONSTRAINED DISTRIBUTION PROBLEM 



Richard M. Reese and Andrew C. Stedry 

School of Economics and Management 
Oakland University 
Rochester, Michigan 



ABSTRACT 



The transportation model with supplies ($S_i$) and demands ($D_j$) treated as bounded
variables developed by Charnes and Klingman is extended to the case where the
$S_i$ and $D_j$ are independently and uniformly distributed random variables. Chance
constraints which require that demand at the jth destination will be satisfied with
probability at least $\beta_j$ and that stockout at the ith origin will occur with probability
less than $\alpha_i$ are imposed. Conversion of the chance constraints to their linear
equivalents results in a transportation problem with one more row and column than
the original with some of the new arcs capacitated. The chance-constrained
formulation is extended to the transshipment problem.



INTRODUCTION 

Developments in network models have evolved from the first attempts by Dantzig [15] to solve
transportation models using the simplex method through the stepping stone method of Charnes
and Cooper [9] to recent advances in solution techniques for generalized networks (Balas [1], Balas
and Hammer [2, 3, 4], Charnes and Kirby [11], Charnes and Raike [14]). Considerable emphasis
has been placed on developing efficient computational techniques. Lemke's [22] dual method,
Dantzig's row-column-sum method (see [16]), Orden's [24] characterization of the transshipment
problem as a transportation model, Wagner's [27] techniques for capacitated networks, Ford and
Fulkerson's network algorithms ([17] and particularly the out-of-kilter technique [18]) and Vogel's
approximation method [25] represent significant advances. Recent developments include
investigations of efficient dual methods by Glover, Klingman and Napier [19], computation of efficient initial
solutions (Glover, Klingman and Napier [20], Napier [23], Glover, Karney, Klingman, Napier
[21]), and an improvement of the out-of-kilter method (Barr, Glover and Klingman [5]).

The great strides made in computational methods for network problems in the past twenty
years, coupled with comparable advances in computer technology in the same period, have resulted
in special purpose computer codes that can solve extremely large network problems in seconds.
Napier [23] reports solution times in the neighborhood of 30 seconds on a CDC 6600 for networks
of 100 nodes (dense) and 200 nodes (non-dense) with up to 10000 arcs. Also, more recently, Ross,
Klingman and Napier [26] have examined the effect on the computational efficiency of various
problem dimensions in transportation problems such as the number of variables, rectangularity,
density, number of constraints and the variance and skewness of the objective function.

It is reasonable to conclude that these special purpose algorithms can solve problems as large as
will be encountered in actual networks. In an industrial application for, say, a monthly shipment
schedule, many minutes, if not hours, might otherwise be devoted to solving a very large problem.

STOCHASTIC DEMANDS AND SUPPLIES 

Thus far, developments in network models have been in the main deterministic. Charnes and
Kirby [12] present a number of chance-constrained formulations which might be applied to
networks. Our aim here, however, is to present a special purpose technique for chance constraints
applied to transportation or transshipment problems. Briefly, the model permits chance
constraints on shipments such that the probability that all demands be met at a destination is
constrained below; the probability of a stockout at an origin is bounded above.

We shall proceed from the standard transportation model to Charnes and Klingman's [13] 
treatment of supply and demand as bounded variables. From there, the extension to the chance- 
constrained model is quite natural and results in a tableau which is structurally identical to the 
bounded variables case. 

THE CAPACITATED DISTRIBUTION MODEL 

The chance-constrained distribution model is an adaptation of the Charnes and Klingman 
[13] modification of the distribution problem to encompass upper and lower bounds on the 
requirements. 

Let $x_{ij}$ be the amount flowing from node i to node j, $c_{ij}$ the cost of that flow, $S_i$ the total supply
available at node i, and $D_j$ the total demand at node j. Finally, let $I = \{i \mid i = 1, \ldots, m\}$ and
$J = \{j \mid j = 1, \ldots, n\}$. The standard transportation problem is then:

Minimize   $\sum_{i \in I} \sum_{j \in J} c_{ij} x_{ij}$

Subject to:   $\sum_{j \in J} x_{ij} = S_i$,   $i \in I$

(1)   $\sum_{i \in I} x_{ij} = D_j$,   $j \in J$

$x_{ij} \ge 0$,   $i \in I$, $j \in J$

Charnes and Klingman consider $S_i$ and $D_j$ to be bounded variables, thus allowing some
flexibility in the distribution program. In effect this permits transportation costs to determine
somewhat the requirements at the nodes. Designate $\bar{D}_j$ as the upper bound on $D_j$ and $\underline{D}_j$ as the lower
bound. Then $\underline{D}_j \le D_j \le \bar{D}_j$ and, similarly, $\underline{S}_i \le S_i \le \bar{S}_i$ for all $i \in I$, $j \in J$. Now, augment I and J so that

$I' = \{i \mid i = 1, \ldots, m+1\}$ and $J' = \{j \mid j = 1, \ldots, n+1\}$.

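Consider a destination node where

(2a)   $\sum_{i \in I} x_{ij} = D_j$

(2b)   where $\underline{D}_j \le D_j \le \bar{D}_j$

Let $x_{m+1,j} = \bar{D}_j - D_j$. Equation (2a) becomes

(3)   $\sum_{i \in I} x_{ij} + x_{m+1,j} = \bar{D}_j$

and the bound conditions (2b) can be rearranged so that

(4a)   $D_j \le \bar{D}_j \Rightarrow 0 \le \bar{D}_j - D_j = x_{m+1,j}$

(4b)   $\underline{D}_j \le D_j \Rightarrow -\underline{D}_j \ge -D_j \Rightarrow \bar{D}_j - \underline{D}_j \ge \bar{D}_j - D_j = x_{m+1,j}$

or, combining (4a) and (4b)

(5)   $0 \le x_{m+1,j} \le \bar{D}_j - \underline{D}_j$

An analogous derivation holds for $\underline{S}_i \le S_i \le \bar{S}_i$ which results in

(6)   $0 \le x_{i,n+1} \le \bar{S}_i - \underline{S}_i$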

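Thus the modified transportation problem is a capacitated distribution problem of the form:

Minimize:

$\sum_{i \in I'} \sum_{j \in J'} c_{ij} x_{ij}$

Subject to:

$\sum_{j \in J'} x_{ij} = \bar{S}_i$,   $i \in I$

(7)   $\sum_{i \in I'} x_{ij} = \bar{D}_j$,   $j \in J$

$\sum_{i \in I'} x_{i,n+1} = \sum_{i \in I} \bar{S}_i$

$\sum_{j \in J'} x_{m+1,j} = \sum_{j \in J} \bar{D}_j$

$0 \le x_{m+1,j} \le \bar{D}_j - \underline{D}_j$,   $j \in J$

$0 \le x_{i,n+1} \le \bar{S}_i - \underline{S}_i$,   $i \in I$

$x_{ij} \ge 0$,   $i \in I$, $j \in J$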
Note that the conditions on the summations of the capacitated variables keep total supply equal
to total demand in the augmented problem and also provide directly the actual total flow through
the system by way of $x_{m+1,n+1}$.
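The augmentation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `augment_bounded_problem` is hypothetical, the slack arcs are given zero cost, and the slack row and column are balanced with the sums of the upper bounds, which is one consistent way to close the tableau rather than a formulation confirmed by the text.

```python
def augment_bounded_problem(cost, s_lo, s_hi, d_lo, d_hi):
    """Build an (m+1) x (n+1) capacitated tableau from bounded supplies and demands.

    cost is an m x n list of lists; s_lo/s_hi and d_lo/d_hi are the lower and
    upper bounds on supplies and demands. Returns (cost, supply, demand, capacity).
    """
    m, n = len(s_hi), len(d_hi)
    inf = float("inf")

    # Assumption of this sketch: slack arcs (row m+1, column n+1) cost nothing.
    aug_cost = [list(row) + [0.0] for row in cost] + [[0.0] * (n + 1)]

    # Rows ship their upper bound; the slack row is sized to balance the problem.
    supply = list(s_hi) + [sum(d_hi)]
    demand = list(d_hi) + [sum(s_hi)]

    # Only the slack arcs are capacitated, as in (5) and (6).
    capacity = [[inf] * (n + 1) for _ in range(m + 1)]
    for j in range(n):
        capacity[m][j] = d_hi[j] - d_lo[j]
    for i in range(m):
        capacity[i][n] = s_hi[i] - s_lo[i]
    return aug_cost, supply, demand, capacity


if __name__ == "__main__":
    c, s, d, cap = augment_bounded_problem(
        cost=[[1.0, 2.0], [3.0, 4.0]],
        s_lo=[5, 5], s_hi=[10, 8],
        d_lo=[4, 6], d_hi=[9, 7],
    )
    print(s, d)
    print(cap)
```

The corner arc (m+1, n+1) is left uncapacitated, so its flow is free to absorb whatever total shipment the cost structure selects.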

CHANCE-CONSTRAINED ADAPTATION 

Suppose, instead of treating the supplies and demands as bounded variables, we assume that
the $S_i$, $i \in I$ are uniformly and independently distributed random variables in the intervals $[\underline{S}_i, \bar{S}_i]$
and the $D_j$, $j \in J$ independent uniform deviates in the intervals $[\underline{D}_j, \bar{D}_j]$. We now insist that all
demands at the jth node be met with at least probability $\beta_j$, or

(8)   $P\{\sum_{i \in I} x_{ij} \ge D_j\} \ge \beta_j$

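Such chance constraints are readily converted to linear inequalities for uniform deviates (cf. Charnes,
Cooper and Symonds [10] and Charnes and Cooper [6]). Let x be uniformly distributed in [a, b]. Then

$F(x) = \int_a^x \frac{dy}{b-a} = \frac{x-a}{b-a}$

so that

$F(x) \ge \beta$

becomes

(9)   $x \ge a + \beta(b-a)$

Since (8) is of the form

$F_j(\sum_{i \in I} x_{ij}) \ge \beta_j$

we can apply (9) to yield:

$\sum_{i \in I} x_{ij} \ge \underline{D}_j + \beta_j(\bar{D}_j - \underline{D}_j)$

Let

$D_j^* = \underline{D}_j + \beta_j(\bar{D}_j - \underline{D}_j)$

so that (9) becomes

(10)   $\sum_{i \in I} x_{ij} \ge D_j^*$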

We can assume that no purpose is served by shipping more to node j than the maximum possible
demand. Hence,

(11)   $D_j^* \le \sum_{i \in I} x_{ij} \le \bar{D}_j$
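As a small numerical illustration of (9) through (11), the sketch below computes the deterministic bounds on shipments into a destination whose demand is uniform on a given interval. The function names are hypothetical and the figures in the usage lines are invented for the example.

```python
def uniform_quantile(lo, hi, p):
    """Inverse CDF of a uniform deviate on [lo, hi], i.e. the right side of (9)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if hi < lo:
        raise ValueError("interval must satisfy lo <= hi")
    return lo + p * (hi - lo)


def demand_bounds(d_lo, d_hi, beta):
    """Return (D_star, D_upper): the shipment window of (11) for one destination."""
    return uniform_quantile(d_lo, d_hi, beta), d_hi


if __name__ == "__main__":
    # Demand uniform on [40, 60], to be met with probability at least 0.9.
    d_star, d_upper = demand_bounds(40.0, 60.0, 0.9)
    print(d_star, d_upper)
```

Raising the required probability moves the lower limit toward the upper limit, which shrinks the capacity of the corresponding slack arc.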

The concern at the source nodes is that there not be a stockout, i.e., that $S_i$ will satisfy the
shipping requirements from the node.

Let $\alpha_i$ be the permitted stockout probability so that $1 - \alpha_i$ is the desired probability that $S_i$
exceed the programmed shipments from node i. The chance constraint can be expressed as

$P\{S_i \ge \sum_{j \in J} x_{ij}\} \ge 1 - \alpha_i$

or

$1 - F_i(\sum_{j \in J} x_{ij}) \ge 1 - \alpha_i$

$F_i(\sum_{j \in J} x_{ij}) \le \alpha_i$

(12)   $\sum_{j \in J} x_{ij} \le \underline{S}_i + \alpha_i(\bar{S}_i - \underline{S}_i)$

Let

$S_i^* = \underline{S}_i + \alpha_i(\bar{S}_i - \underline{S}_i)$

We assume that it is undesirable to ship less than $\underline{S}_i$, the minimum availability at node i. Thus,

(13)   $\underline{S}_i \le \sum_{j \in J} x_{ij} \le S_i^*$
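The supply-side bound can be checked empirically. The following sketch, with hypothetical function names and invented figures, computes the shipment window of (13) and then estimates by simulation how often a uniformly distributed supply falls short of a planned shipment; shipping exactly the upper limit should produce a stockout frequency close to the permitted probability.

```python
import random


def supply_bounds(s_lo, s_hi, alpha):
    """Return (S_lower, S_star): the shipment window of (13) for one origin."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return s_lo, s_lo + alpha * (s_hi - s_lo)


def stockout_frequency(s_lo, s_hi, shipped, trials=100_000, seed=0):
    """Estimate P(supply < shipped) when supply is uniform on [s_lo, s_hi]."""
    rng = random.Random(seed)
    short = sum(1 for _ in range(trials) if rng.uniform(s_lo, s_hi) < shipped)
    return short / trials


if __name__ == "__main__":
    lower, s_star = supply_bounds(100.0, 200.0, 0.05)
    print(lower, s_star)
    print(stockout_frequency(100.0, 200.0, s_star))
```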

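From (11) and (13) it is clear that the linear form of the chance constraints results in a
distribution problem where

$\sum_{i \in I} x_{ij}$ and $\sum_{j \in J} x_{ij}$

are bounded variables. Hence we can, by substituting $D_j^*$ for $\underline{D}_j$ and $S_i^*$ for $\bar{S}_i$ in (7), express the
chance-constrained distribution problem as:

Minimize:

$\sum_{i \in I'} \sum_{j \in J'} \bar{c}_{ij} x_{ij}$

Subject to:

$\sum_{j \in J'} x_{ij} = S_i^*$,   $i \in I$

(14)   $\sum_{i \in I'} x_{ij} = \bar{D}_j$,   $j \in J$

$\sum_{i \in I'} x_{i,n+1} = \sum_{i \in I} S_i^*$

$\sum_{j \in J'} x_{m+1,j} = \sum_{j \in J} \bar{D}_j$

$0 \le x_{m+1,j} \le \bar{D}_j - D_j^*$,   $j \in J$

$0 \le x_{i,n+1} \le S_i^* - \underline{S}_i$,   $i \in I$

$x_{ij} \ge 0$,   $i \in I$, $j \in J$

where $\bar{c}_{ij} \equiv E\,c_{ij}$.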

EXTENSION TO THE TRANSSHIPMENT PROBLEM 

The conversion of the transshipment problem to a transportation model has been shown
formally by Orden [24] so we shall proceed here by example. Our conversion differs slightly from
Orden's in that he deals with net flows at transshipment nodes by altering either the demands or
supplies by the net amount while, for our purposes, altering the demand or the supply for nodes
at which, respectively, demands and supplies exist is more satisfactory.

Consider the network of Figure 1(a). Node 1 is a pure source, node 4 is a pure sink while 2 and 
3 are transshipment nodes. A zero-cost feedback flow at the transshipment nodes can be intro- 
duced, as shown in Figure 1(b) without altering the problem. The network of Figure 1(b) can be 
represented as a distribution problem as shown in Figure 2(a). 



Figure 1. — A simple network. Panels (a) and (b).

Figure 2. — Transportation model form of the simple network. Panels (a) and (b).

In the form of Figure 2(a), however, it is possible that the feedback flows can take on negative
values. This can be prevented by insisting that the total flows out of node 2 (at which a demand 
exists) equal the maximum of the possible inputs to node 2, viz. the sum of the total supplies to 
the system. Similarly the inputs to node 3 (at which there is supply) are equated to the total output 
from the system. The altered distribution problem is shown in Figure 2(b) where 15, the total 
flow, is added to both the inputs to and outputs from the transshipment nodes. 
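The adjustment just described can be sketched as follows. The function name and the dictionary-based data layout are hypothetical; the figures are those of the simple network (supplies of 10 and 5 at nodes 1 and 3, demands of 2 and 13 at nodes 2 and 4, with nodes 2 and 3 acting as transshipment points).

```python
def transshipment_to_transportation(supply, demand, transship):
    """Return (total_flow, row_supplies, column_demands) for the converted problem.

    supply and demand map node -> amount; transship is the set of nodes that
    may forward flow. Each transshipment node gets a row and a column, both
    increased by the total flow so that feedback flows stay non-negative.
    """
    total = sum(supply.values())
    if total != sum(demand.values()):
        raise ValueError("total supply must equal total demand")

    rows, cols = {}, {}
    for node in sorted(set(supply) | set(demand) | set(transship)):
        s = supply.get(node, 0)
        d = demand.get(node, 0)
        if node in transship:
            rows[node] = total + s   # outputs from the node
            cols[node] = total + d   # inputs to the node
        else:
            if s > 0:
                rows[node] = s       # pure source
            if d > 0:
                cols[node] = d       # pure sink
    return total, rows, cols


if __name__ == "__main__":
    total, rows, cols = transshipment_to_transportation(
        supply={1: 10, 3: 5}, demand={2: 2, 4: 13}, transship={2, 3}
    )
    print(total, rows, cols)
```

Row and column totals of the converted problem agree, so the result can be handed to any balanced transportation code.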






In algebraic form, the Kirchhoff node conditions for the network of Figure 1(a) are shown in
Table 1. Multiplying the equations for nodes at which demands exist by −1 we obtain the relations
in Table 2.

Table 1. — Kirchhoff Node Conditions for the Simple Network

Node \ Arc      (1,2)   (1,3)   (1,4)   (2,2)   (2,3)   (2,4)   (3,2)   (3,3)   (3,4)
  1     10 =    +x12    +x13    +x14
  2     -2 =    -x12                            +x23    +x24    -x32
  3      5 =            -x13                    -x23            +x32            +x34
  4    -13 =                    -x14                    -x24                    -x34

Table 2. — Altered Kirchhoff Node Conditions

Node \ Arc      (1,2)   (1,3)   (1,4)   (2,2)   (2,3)   (2,4)   (3,2)   (3,3)   (3,4)
  1     10 =    +x12    +x13    +x14
  2      2 =    +x12                            -x23    -x24    +x32
  3      5 =            -x13                    -x23            +x32            +x34
  4     13 =                    +x14                    +x24                    +x34

We observe that the pure source (1) and the pure sink (4) are already in transportation problem
form. We can replace the equations for the transshipment nodes (2) and (3) by pairs of equations
whose difference yields the original equation, viz.

15 = x22 + x23 + x24
17 = x12 + x22 + x32

and

15 = x13 + x23 + x33
20 = x32 + x33 + x34


In each case we have added an equation and a new variable representing the feedback node thus
leaving the determination of the system unchanged. The problem constraints can be represented
in tableau form as shown in Table 3. The non-existent Arc (1, 4) is shown with an arbitrarily large
cost, M, and the feedback nodes with zero cost.

In brief, the conversion here first sets $S_i$ equal to the supply at node i for pure sources and
$D_j$ equal to demand for pure sinks. Then

$B = \sum_{i \in I} S_i = \sum_{j \in J} D_j$

is computed and the demand at transshipment node j is set equal to $B + \max(0, D_j)$ and the supply
equal to $B + \max(0, S_i)$.



Table 3. — Simple Network in Transportation Model Form

From \ To        2       3       4      S_i
    1                            M       10
    2                                    15
    3                                    20
   D_j          17      15      13       45

THE CHANCE-CONSTRAINED TRANSSHIPMENT MODEL 

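Let $I_1$ be the set of pure source nodes, $J_3$ the set of pure sinks and let $I_2$ and $I_3$ represent
transshipment nodes which are sources and sinks, respectively, and $J_1$ and $J_2$ the destinations which are
source and sink transshipment nodes. Clearly the problem is:

Minimize:

$\sum_{i \in I} \sum_{j \in J} c_{ij} x_{ij}$

Subject to:

$\sum_{j \in J} x_{ij} = S_i$,   $i \in I_1$

$\sum_{j \in J} x_{ij} = B + S_i$,   $i \in I_2$

$\sum_{j \in J} x_{ij} = B$,   $i \in I_3$

(15)   $\sum_{i \in I} x_{ij} = B$,   $j \in J_1$

$\sum_{i \in I} x_{ij} = B + D_j$,   $j \in J_2$

$\sum_{i \in I} x_{ij} = D_j$,   $j \in J_3$

$x_{ij} \ge 0$,   $i \in I$, $j \in J$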

The chance-constrained problem, obtained by substituting $B + S_i$ or $B + D_j$ for $S_i$ and $D_j$ in (8) and (12)
as required, is readily comprehended as



Minimize : 
Subject to: 






]^i 



ii: 



z; x,s^s* 



idi 






(16) 





idz 




jtJx 




jiJ2 


^X„=S, 


jfJs 


UJ' UJ 


=s:+ 






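$0 \le x_{m+1,j}$,   $j \in J_1$

$0 \le x_{i,n+1} \le S_i^* - \underline{S}_i$,   $i \in I_1 \cup I_2$

$0 \le x_{i,n+1}$,   $i \in I_3$

$x_{ij} \ge 0$,   $i \in I$, $j \in J$

where, as usual, it is assumed that $c_{ij}$ and $x_{ij}$ are independent so that $E\,c_{ij} x_{ij} = \bar{c}_{ij} x_{ij}$.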

CONCLUSIONS 

The addition of chance constraints to the transportation and transshipment models has been
accomplished with only a trivial increase in computational requirements. In an m × n
transportation model the conversion involves the addition of m + n + 1 arcs where at most max (m, n)
of these are capacitated. Thus, large scale chance-constrained distribution models can be solved
using already available computer codes.

RECOMMENDATIONS FOR FURTHER RESEARCH 

The chance constraints investigated here with uniform random deviates can be adapted
readily to any density of the $S_i$ and $D_j$ provided, of course: (1) the variables are independently
distributed; (2) the density functions are non-zero only in a finite interval (i.e., truncated above 
and below) ; and (3) the inverse cumulative distribution function can be computed or derived by 
Monte Carlo methods. The triangular and beta densities immediately present themselves as do 
empirically derived densities which are inherently truncated. 
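For a density other than the uniform, the same conversion only needs the inverse cumulative distribution function in place of the linear interpolation. The sketch below illustrates this for the triangular density; the function names and the numbers in the usage lines are hypothetical, and the closed-form inverse is the standard one for a triangular density with the given minimum, mode and maximum.

```python
import math


def triangular_inverse_cdf(lo, mode, hi, p):
    """Inverse CDF of a triangular density on [lo, hi] with the given mode."""
    if not (lo <= mode <= hi and lo < hi):
        raise ValueError("need lo <= mode <= hi and lo < hi")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    cut = (mode - lo) / (hi - lo)          # CDF value at the mode
    if p < cut:
        return lo + math.sqrt(p * (hi - lo) * (mode - lo))
    return hi - math.sqrt((1.0 - p) * (hi - lo) * (hi - mode))


def chance_limits(inverse_cdf, probability):
    """Deterministic limit for a chance constraint, given any inverse CDF.

    With probability = beta this is the lower shipment limit at a destination;
    with probability = alpha it is the upper shipment limit at an origin.
    """
    return inverse_cdf(probability)


if __name__ == "__main__":
    # Demand triangular on [0, 10] with mode 5, to be met with probability 0.9.
    d_star = chance_limits(lambda p: triangular_inverse_cdf(0.0, 5.0, 10.0, p), 0.9)
    print(d_star)
```

An empirically derived density could be handled the same way by passing a function that interpolates its tabulated quantiles.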

In its present form the model belongs to the class of zero order decision rule chance-constrained 
models. Thus, further work must be done to expand the model in a dynamic context to maximizing 
over a multi-period horizon. 



BIBLIOGRAPHY

[1] Balas, Egon, "The Dual Method for the Generalized Transportation Problem," Management Science, 12, 555-568 (1966).

[2] Balas, Egon, and P. L. Hammer (Ivanescu), "On the Generalized Transportation Problem," Management Science, 11, 188-202 (1964).

[3] Balas, Egon, and P. L. Hammer (Ivanescu), "On the Transportation Problem — Part I," Cahiers du Centre d'Etudes de Recherche Operationelle, 4, No. 2 (1962).

[4] Balas, Egon, and P. L. Hammer (Ivanescu), "On the Transportation Problem — Part II," Cahiers du Centre d'Etudes de Recherche Operationelle, 4, No. 3 (1962).

[5] Barr, R. S., F. Glover and D. Klingman, "An Improved Version of the Out-of-Kilter Method and A Comparative Study of Computer Codes," Mathematical Programming, 7, 60-86 (1974).

[6] Charnes, A. and W. W. Cooper, "Chance-Constrained Programming," Management Science, 6, 73-79 (1959).

[7] Charnes, A. and W. W. Cooper, "Deterministic Equivalents for Optimizing and Satisficing under Chance Constraints," Operations Research, 11, 18-39 (1963).

[8] Charnes, A. and W. W. Cooper, Management Models and Industrial Applications of Linear Programming (New York: John Wiley & Sons, Inc., 1961).

[9] Charnes, A. and W. W. Cooper, "The Stepping Stone Method of Explaining Linear Programming in Transportation Problems," Management Science, 1, No. 1 (1954).

[10] Charnes, A., W. W. Cooper, and G. H. Symonds, "Cost Horizons and Certainty Equivalents: An Approach to Stochastic Programming of Heating Oil," Management Science, 4, 235-263 (1958).

[11] Charnes, A. and M. Kirby, "The Dual Method and the Method of Balas and Ivanescu for the Transportation Model," Cahiers du Centre d'Etudes de Recherche Operationelle, 6, No. 1 (1964).

[12] Charnes, A. and M. J. L. Kirby, "Some Special P-Models in Chance-Constrained Programming," Management Science, 14, 183-195 (1967).

[13] Charnes, A. and D. Klingman, "The Distribution Problem with Upper and Lower Bounds on Node Requirements," Management Science, 16, 638-642 (1970).

[14] Charnes, A. and W. M. Raike, "One-Pass Algorithm for Some Generalized Network Problems," Operations Research, 14, 914-924 (1966).

[15] Dantzig, G. B., "Application of the Simplex Method to a Transportation Problem," in T. C. Koopmans (ed.), Activity Analysis of Production and Allocation, Cowles Commission Monograph No. 13 (New York: John Wiley & Sons, Inc., 1951).

[16] Dantzig, G. B., Linear Programming and Extensions (Princeton, N.J.: Princeton University Press, 1963).

[17] Ford, L. R., Jr., and D. R. Fulkerson, Flows in Networks (Princeton, N.J.: Princeton University Press, 1962).

[18] Ford, L. R., Jr., and D. R. Fulkerson, "An Out-of-Kilter Method for Minimal Cost Flow Problems," SIAM Journal, 9, No. 1 (1961).

[19] Glover, F., D. Klingman and A. Napier, "An Efficient Dual Approach to Network Problems," Working Paper 71-57, The University of Texas, Austin, Texas (May 1971).

[20] Glover, F., D. Klingman and A. Napier, "A One-Pass Algorithm to Determine a Dual Feasible Basic Solution for a Class of Capacitated Generalized Networks," Center for Cybernetic Studies, Research Report 42, The University of Texas, Austin, Texas (October, 1970).

[21] Glover, F., D. Karney, D. Klingman and A. Napier, "A Computation Study on Start Procedures, Basis Change Criteria, and Solution Algorithms for Transportation Problems," Management Science, 20, 793-813 (1974).

[22] Lemke, C., "The Dual Method of Solving Linear Programming Problems," Naval Research Logistics Quarterly, 1, No. 1 (1954).

[23] Napier, H. A. Jr., "Some Algorithmic Procedures for Networks and their Computational Relationship with Existing Network Algorithms," Doctoral Dissertation, The University of Texas, Austin, Texas (May 1971).

[24] Orden, A., "The Transshipment Problem," Management Science, 2, 276-285 (1956).

[25] Reinfield, N. V. and W. R. Vogel, Mathematical Programming (Englewood Cliffs, N.J.: Prentice-Hall, Inc., 1958).

[26] Ross, Terry G., D. Klingman and A. Napier, "A Computational Study of the Effects of Problem Dimensions on Solution Times for Transportation Problems," Journal of the Association for Computing Machinery, 22, 413-424 (1975).

[27] Wagner, H. M., "On a Class of Capacitated Transportation Problems," Management Science, 5, 304-318 (1959).

[28] Wagner, H. M., Principles of Operations Research (Englewood Cliffs, N.J.: Prentice-Hall, Inc., 1969).



ELEMENTS OF A THEORY IN NON-CONVEX PROGRAMMING 



Claude-Alain Burdet 

SYSTEMATHICA Consulting Group Ltd. 
Pittsburgh, Pennsylvania 



ABSTRACT 

The question of necessary and sufficient optimality conditions for non-convex 
programs is analyzed in the general context of subadditivity. Several types of 
convex set extensions are investigated to generate valid inequalities from the 
corresponding gauge functions. 



1. SUMMARY 

We first present a set of "naive" optimality conditions (necessary and sufficient) applicable to a
general mathematical program. An example and its Kuhn-Tucker conditions are described to show 
why such necessary conditions are impractical in a non-convex situation. 

We then proceed with necessary conditions in inequality form and show how this concept is 
related to that of subadditive gauge functions (see Section 3.2). In Sections 3.3 and 3.4, we generalize 
Tuy's intersection method and describe the construction of cutting planes primarily based on the 
objective function; it is shown that Tuy's cuts are uniformly dominated in the present framework. 
Dominance is strict when the level set of the objective function is unbounded. 

We next investigate Tuy's idea of convex extension. A new formulation allows relaxation of the 
convexity assumption for the objective function and produces quasi-convex extensions which subsume 
Tuy's concept. In fact extensions can be defined in very general terms (see Section 3.6); and any 
valid inequality (including the most stringent ones) can be cast in this framework. 

Section 4 investigates an analytical characterization of the concept of extension and the pos- 
sibility to explicitly construct the corresponding cutting planes. This is obtained from convex, 
quasi-convex and/or polaroid gauge extensions. The basic thrust of our research is to obtain results 
stronger than Tuy's cuts which have proved disappointing. We use subadditive gauges on the one 
hand and extensions on the other as a vehicle to circumvent the obstacles met by Tuy's concepts 
and methodology. Although no algorithmic implication is discussed here in depth, the general 
direction of this study is aimed at solving some typical problems in the difficult area of non-convex 
optimization. 

2. OPTIMALITY 

Consider the mathematical program

Maximize $f(x)$ subject to $x \in X \subset R^n$

One has the immediate necessary and sufficient optimality conditions:

(2.1)   $\bar{x} \in X$ is optimal, with $\bar{k} = f(\bar{x})$, iff $X \subset \mathrm{lev}_{\bar{k}} f$†



The above optimality conditions are "naive" in the sense that they yield no constructive 
solution to the mathematical program and, in fact, they represent little more than a parody of the 
definition for global optimality. 

It is well known in the non-linear programming literature that necessary conditions for local 
optimality can be obtained from the Kuhn-Tucker theory; furthermore, under suitable convexity 
assumptions for f and X, these conditions turn out to be sufficient, so that the question of optimality
is fully answered. 

In the absence of convexity, the situation is quite different however: sufficiency is no longer 
guaranteed. Moreover there frequently exists an inordinate number of local optima so that explicit 
search is impractical. But the situation is worse yet (as shown in the illustrative example (Figure 1a)
below) : in addition to numerous local optima, there also exists a myriad of "useless" Kuhn-Tucker 
points of the saddle type, which are not locally optimal! Thus global optimization by means of 
successive inspection of K-T points tends to be very inefficient. 

EXAMPLE: Consider the "concave programming" problem:

Maximize $f = \sum_i (x_i^2 + \epsilon_i x_i)$

subject to $-1 \le x_i \le 1$, $i = 1, \ldots, n$

where the $\epsilon_i$ are small positive quantities, $\forall i$.

Optimal value: $n + \sum_i \epsilon_i = \bar{k}$

Optimal solution: $\bar{x}_i = 1$, $\forall i$

Figure 1a

†The notations lev, aff, cl, epi, bd and conv denote the level set, affine hull, closure, epigraph, boundary set
and convex hull respectively [8].



Number of Kuhn-Tucker points: $\sum_{k=0}^{n} \binom{n}{k} 2^{n-k} = 9$

Number of local minima (global): $\binom{n}{n} 2^{0} = 1$ (convex case)

Number of local maxima: $\binom{n}{0} 2^{n} = 4$

Number of saddle points: $\sum_{k=1}^{n-1} \binom{n}{k} 2^{n-k} = \binom{2}{1} 2 = 4$

In this case there is a Kuhn-Tucker point on each face (of every dimension $k \le n$) of the
hypercube X; thus the Kuhn-Tucker theory requires one to check an enormous number of points, only
very few of which are locally optimal; but even local optima are too numerous ($2^n$) to be checked
explicitly as soon as n becomes reasonably large (say > 20).
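The counts above follow from the number of faces of each dimension of the hypercube, and a short sketch makes the growth with n concrete. The function name `kuhn_tucker_census` is hypothetical; the only fact used is that an n-dimensional hypercube has $\binom{n}{k} 2^{n-k}$ faces of dimension k, each carrying one Kuhn-Tucker point in this example.

```python
from math import comb


def kuhn_tucker_census(n):
    """Count Kuhn-Tucker points of the example by face dimension of the n-cube."""
    if n < 1:
        raise ValueError("n must be at least 1")
    faces = [comb(n, k) * 2 ** (n - k) for k in range(n + 1)]
    return {
        "total": sum(faces),                # all Kuhn-Tucker points
        "local_maxima": faces[0],           # vertices (dimension 0)
        "saddle_points": sum(faces[1:n]),   # faces of dimension 1 .. n-1
        "local_minima": faces[n],           # the interior stationary point
    }


if __name__ == "__main__":
    for n in (2, 10, 20):
        print(n, kuhn_tucker_census(n))
```

The total equals $3^n$, so at n = 20 there are already more than three billion Kuhn-Tucker points, of which about a million are local maxima.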

3. INEQUALITIES 
3.1. Generalities 

In the search for alternatives to the Kuhn-Tucker theory, one may attempt verifying
(constructively) the inclusion $X \subset \mathrm{lev}_{\bar{k}} f$.

This can be conceptually accomplished in many different ways which eventually amount to 
"splitting" X into several subsets with subsequent verification of the inclusion for each subset. 
Cutting planes are usually associated with a repeated "shrinking process" of the feasible set, which 
may lead to convergence difficulties; alternately another approach is described below: it is finite 
and therefore remains free of convergence problems. In practice a large number of subsets is usually 
required; but the use of cutting planes (i.e., necessary conditions) may in certain cases furnish some 
hope to efficiently curtail the search for global optimality. 

EXAMPLE: Let X be defined by the inequalities

(3.1)   $g_k(x) \le 0$, $k \in M$

The Kuhn-Tucker conditions are known to contain a set of complementarity conditions which
read:

(3.2)   $\lambda_k g_k = 0$, $\forall k \in M$

This structure can now be used to set up a dichotomous arborescence which generates
$2^{m+1} - 1$ subproblems of X according to the two possibilities:

(3.3)   a) $\lambda_k = 0$, $g_k \le 0$     b) $g_k = 0$, $k \in M$

which both imply (3.2) and are mutually exclusive.

This "facial" decomposition methodology [5] has been applied successfully to some classes of 
problems in integer programming and in general quadratic programming; it is particularly useful 
when X is polyhedral (i.e., $g_k$ linear). Facial decomposition is a natural extension of the idea of
Branch and Bound, as both methods become identical when X is a hypercube. 




The use of cutting planes offers a possibility to discard certain subproblems from further 
explicit computations. 

Since the proposed method is similar to Branch and Bound algorithms, it possesses the same 
underlying structure of exhaustive search among candidates for the global optimum. Thus the 
process is intrinsically finite. For a polyhedral feasible set X, one obtains a tree structure which is 
completely determined by the combinatorial structure of X; each node is a program on a face of X 
(subprogram). Nodes are fathomed by a machinery of cutting planes designed to eliminate sub- 
programs (faces) known to contain no point with a better objective function value than the current 
best solution (see [5] for more details) . 

3.2. Inequalities Generated From Subadditive Gauge Functions 

We now investigate a general framework for the construction of inequalities. 
DEFINITION 1: A subadditive gauge* function $\pi$ is defined to satisfy:
SUBADDITIVITY:

(3.4a)   $\pi(u) + \pi(v) \ge \pi(u+v)$, $\forall u, v$

GAUGE:

(3.4b)   $\pi(\lambda u) = \lambda \pi(u)$, $\forall \lambda > 0$ (positively homogeneous of degree one)

LEMMA 1: For gauge functions, subadditivity and convexity are equivalent.
PROOF:
From convexity:

$\pi(\alpha u + (1-\alpha)v) \le \alpha \pi(u) + (1-\alpha)\pi(v)$

hence for $\alpha = 1/2$ one obtains

$\pi(u+v) = 2\pi(\tfrac{1}{2}u + \tfrac{1}{2}v) \le \pi(u) + \pi(v)$

i.e. subadditivity.
Conversely:

$\alpha \pi(u) + (1-\alpha)\pi(v) = \pi(\alpha u) + \pi((1-\alpha)v) \ge \pi(\alpha u + (1-\alpha)v)$   Q.E.D.



For simplicity we now assume that every point $x \in X$ can be represented by reference to a given
finite system of vectors $\{e_j \in R^n\}$ with respect to $\bar{x} \in \mathrm{Aff}\, X \subset R^n$, i.e.

$x = \bar{x} + u = \bar{x} + \sum_j e_j t_j$

Assume the "coordinates" $t_j$ to be non-negative, i.e. the system $\{e_j\}$ spans a cone at $\bar{x}$ which
contains X.


*Rockafellar [8] introduced this gauge function terminology in the context of Minkowski functionals charac- 
terizing a convex body; the present concept is closely related and we use the same term to characterize a broader 
class of functions. 



DEFINITION 2: An inequality

$\sum_{j \in N} \pi_j t_j \ge \pi_0$

is called k-valid if the set

(3.5)   $X \cap \{x = x(t) \mid \sum_{j \in N} \pi_j t_j < \pi_0\}$

contains no point x such that $f(x) > k$.

THEOREM 1: Let $\pi$ be a subadditive gauge: $\{\mathrm{Aff}(X) - \bar{x}\} \to R$; then the inequality

$\sum_{j \in N} \pi_j t_j \ge \pi_0$

is k-valid, with

(3.6a)   $\pi_j = \pi(e_j)$,

(3.6b)   $\pi_0 = \min \pi(u)$ over $x \in X$ with $f(x) \ge k$, where $u = \sum_j e_j t_j = x - \bar{x}$, $\forall x \in X$

PROOF: By induction on n. Since $\pi$ is a gauge and $t_j \ge 0$, $\forall j \in N$, one has:

$\pi_j t_j = \pi(e_j) t_j = \pi(e_j t_j)$

and, by subadditivity,

$\sum_j \pi(e_j t_j) \ge \pi(\sum_j e_j t_j) = \pi(u)$

Summing all inequalities, one obtains:

$\sum_j \pi_j t_j \ge \pi(u)$.

Now $\pi(u) \ge \pi_0$, $\forall x \in X$ with $f(x) > k$, by definition. Q.E.D.

REMARK: Since one usually does not know in advance the optimal value $\bar{k}$, one has to
construct k-valid inequalities with $k \le \bar{k}$, where k is a known (or estimated) lower bound for the
maximum $\bar{k}$.

EXAMPLE: Let X be characterized by the linear system

$x_i = \sum_j a_{ij} t_j \le \bar{x}_i$, $i \in M$

$t_j \ge 0$

as customary in linear programming.
Then one may define as in (3.6a):

$\pi_j = \pi(e_j) = \pi(a_j)$, with $e_j = a_j = (a_{1j}, \ldots, a_{ij}, \ldots, a_{mj})$

$\pi_0 = \min \pi(u)$ over $u = x - \bar{x}$, $f(x) \ge k$, $x \in X$

For instance, choose (see [3])

$\pi = \sum_{i \in M} p_i(u_i) |u_i|$

with parameters

$p_i(u_i) = p_i^+$ if $u_i > 0$, and $p_i(u_i) = -p_i^-$ if $u_i < 0$

where $p_i^+ \ge p_i^-$ for convexity of $\pi$ (hence subadditivity since $\pi$ is clearly a gauge)
i.e.

(3.7a)   $\pi_j = \sum_i p_i(a_{ij}) |a_{ij}|$

(3.7b)   $\pi_0 = \min_{f(x) \ge k} \sum_i p_i(u_i) |u_i|$
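A piecewise-linear gauge of the kind in (3.7) is easy to experiment with numerically. The sketch below uses hypothetical names (`make_gauge`, `cut_coefficients`) and invented parameter values; it builds the gauge from the slopes on the positive and negative side of each coordinate and evaluates the coefficients of (3.7a) on the columns of a small matrix. The right-hand side of (3.7b) is not computed here, since it requires solving a minimization over the feasible set.

```python
def make_gauge(p_plus, p_minus):
    """Return pi(u) = sum_i p_i(u_i) |u_i| with slope p_plus[i] for u_i > 0
    and slope p_minus[i] for u_i < 0 (so the term equals p_minus[i] * u_i)."""
    if any(hi < lo for hi, lo in zip(p_plus, p_minus)):
        raise ValueError("convexity requires p_plus[i] >= p_minus[i] for all i")

    def gauge(u):
        return sum(pp * x if x > 0 else pm * x
                   for pp, pm, x in zip(p_plus, p_minus, u))

    return gauge


def cut_coefficients(gauge, columns):
    """Coefficients pi_j = pi(a_j) of the inequality, one per column a_j."""
    return [gauge(col) for col in columns]


if __name__ == "__main__":
    pi = make_gauge(p_plus=[2.0, 1.0], p_minus=[-1.0, 0.5])
    columns = [[1.0, -2.0], [-1.0, 2.0], [0.0, 3.0]]
    print(cut_coefficients(pi, columns))
```

Each coordinate term is the larger of two linear functions through the origin, which is why the slope condition is enough to give both homogeneity and subadditivity.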

This introductory example exhibits several features which are characteristic of the approach 
developed here : 

a) The subadditive gauge $\pi$ is an auxiliary tool which can be chosen independently of the
problem (i.e. f and X). Of course there may exist some practical advantages in constructing "tailor
made" gauges which reflect the structure of f and/or X. Indeed the program (3.6b) may be very
difficult for arbitrary functions $\pi$. Most of the remaining developments of this paper deal
with the question of exploiting any information contained in f and/or X to construct "appropriate"
gauges in that respect.

b) The coefficients $\pi_j$ of the inequality may be > 0, = 0 or < 0; thus the theory is able to
generate any type of valid inequality.

c) The amount of computations required for the construction of the inequality lies in the
determination of $\pi_j$ and $\pi_0$; the above example shows, however, that the computations for $\pi_j$ are
of a different nature than for $\pi_0$. Since there are n coefficients $\pi_j$, one will typically choose $\pi$
functions where $\pi_j$ is easily obtained (as in the example); the major part of the computational effort
then consists in the determination of the coefficient $\pi_0$. Note that a lower bound $\pi_0' \le \pi_0$ is sufficient
to yield a k-valid inequality.

3.3. Subadditive Gauges Directly Related to f 

Consider the "concave programming" problem

Maximize $f(x)$
Subject to $x \in X \subset R^n$

where $f$ is quasi-convex. Assume that a value $k$ ($\le \bar{k}$) and a point $\bar{x} \in \text{Aff}\, X$ with $f(\bar{x}) \le k$ are known. We now construct a gauge function $\pi$ based on the convex set $(\text{lev}_k f) - \bar{x}$ (see Figure 1b):

(3.8a) a) $\pi$ is a convex gauge, in particular $\pi(0) = 0$

(3.8b) b) $\forall x \ne \bar{x}$ such that $f(x) = k$, let $u = x - \bar{x}$,







Figure 1b. Construction of an $f$-related gauge (see (3.8)). The case (3.8a) applies to the point $\bar{x}$. The case (3.8b) applies to the top half of the illustration, including the line 2 which is tangential to the set $X$ at $\bar{x}$. The case (3.8c) applies to the lower half, including the line 1. One has $\pi \le \pi_0$ within the entire set $\text{lev}_k f$; along the boundary of $\text{lev}_k f$, where $f(x) = k$, one has $\pi = \pi_0$ (curved part) and $\pi < \pi_0$ (linear part).



and set

(3.8c) $\pi(u) =$

$\pi_0/\lambda^*$ otherwise, where $\lambda^* = \max_{0 \le \lambda < +\infty} \{\lambda \mid f(\lambda x + (1-\lambda)\bar{x}) \le k\}$, i.e. $\lambda^* \ge 1$.



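Note that when $f(\bar{x}) < k$ and $\text{lev}_k f$ is a bounded set, the condition b) reduces to:

$\pi(u) = \pi_0$, $\forall x$ such that $f(x) = k$;

indeed convexity of $\text{lev}_k f$ implies that $\lambda = 1$ is the unique solution of

$$f(\lambda x + (1-\lambda)\bar{x}) = k.$$

Choose a system $\{e_j\}$ at $\bar{x}$ such that

$$\forall x \in X: \quad x = \bar{x} + \sum_j e_j t_j, \quad t_j \ge 0;$$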



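we now have:

LEMMA 2:

(3.8d) $\pi(u) \le \pi_0$, $\forall u \in [(\text{lev}_k f) - \bar{x}]$, i.e. $\text{lev}_{\pi_0} \pi = [(\text{lev}_k f) - \bar{x}]$

PROOF: Take $x$ with $f(x) > k$. Since $\pi$ is a gauge, one has on the ray $\lambda u$, with $u = x - \bar{x}$ and $\lambda \in [0, 1]$:

$$\pi(\lambda u) = \lambda \pi(u).$$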




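But by hypothesis $f(\bar{x}) \le k$; hence for some $\lambda \in [0, 1)$, one has $f(\lambda u + \bar{x}) = k$:

a) if $\lambda = 0$ is the only such value, then $\pi(\lambda u) = \pi(u) = +\infty \ge \pi_0$, $\forall \lambda > 0$ (from (3.8b)); and $\bar{x} \in \text{bd}\, \text{lev}_{\pi_0} \pi$.

b) Otherwise for all $\lambda$ with $f(\lambda u + \bar{x}) = k$, one has $\lambda \pi(u) \le \lambda^* \pi(u) = \pi_0$, from (3.8c); furthermore (3.8c), i.e. $\lambda^* \le 1$, and $f(x) > k$ imply $\lambda^* < 1$. Hence one must have $\pi(u) > \pi_0$. Q.E.D.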

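LEMMA 3: Assume local optimality of $\bar{x}$, i.e. $f(\bar{x} + \lambda e_j) \le f(\bar{x})$, $\forall \lambda \in [0, \epsilon)$ with $\epsilon > 0$ sufficiently small. Then

a) $\pi_j = \pi_0/\lambda_j^*$, with $\lambda_j^* = \max_{0 \le \lambda < +\infty} \{\lambda \mid f(\lambda e_j + \bar{x}) = k\}$

b) $\pi_j \le 0$ otherwise

PROOF: a) by construction one has $\lambda_j^* \pi_j = \pi(e_j \lambda_j^*) = \pi_0$; since local optimality of $\bar{x}$ implies that

for some $\lambda > 0$, the case $\pi = +\infty$ cannot occur.

b) If $f(e_j \lambda + \bar{x}) < k$, $\forall \lambda \ge 0$, one may always set $\pi_j = 0$, i.e. $\lambda_j^* = \infty$; however non-negativity of the gauge $\pi$ is not assumed, nor implied by the convexity assumption (3.4). Q.E.D.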

THEOREM 2: Assume (3.8) to hold true and let $\bar{x}$ be locally optimal with $f(\bar{x}) \le k$. Then

$$\sum_{j \in N} \pi_j t_j \ge \pi_0$$

is a $k$-valid inequality, where

(3.9a) a) $\pi_j = \pi_0/\lambda_j^*$, $\lambda_j^* = \max_{0 \le \lambda < +\infty} \{\lambda \mid f(\bar{x} + \lambda e_j) = k\}$

(3.9b) b) $\pi_j \le 0$, for the other $j \in N$ where (3.9a) does not apply.

PROOF: Follows from Lemmas 2 and 3. Q.E.D.

REMARK: Theorem 2 remains true if $\bar{x}$ is only assumed to satisfy $f(\bar{x}) \le k$.

COROLLARY 2.1 (Tuy): Assume $f(\bar{x}) < k$. The cut

$$\sum_{j \in N} \pi_j t_j \ge \pi_0$$

is $k$-valid, with

(3.10) $0 \le \pi_j = \pi_0/\hat{\lambda}_j$,

where

$$\hat{\lambda}_j = \max_{0 \le \lambda < +\infty} \{\lambda \mid f(\bar{x} + \lambda e_j) \le k\} \le +\infty.$$

PROOF: We need to verify the hypothesis (3.8) of Theorem 2: (3.8b) is immediate from Lemma 3, which contains the construction (3.10) as a special case.

We now show that the resulting gauge $\pi$ is convex and therefore satisfies (3.8a). $\pi$ is constructed with convex level sets (3.8d) and is therefore quasi-convex; it is also non-negative. Lemma 4 below completes the proof. Q.E.D.

LEMMA 4: A non-negative quasi-convex gauge $\pi$ is convex.

PROOF: We show that the set $\text{epi}\, \pi = \{(h, u) \mid h \ge \pi(u),\ u \in R^n\}$ is a convex cone in $R^{n+1}$. Since $\pi$ is a gauge, one has, $\forall u \in R^n$ with $0 < \pi(u) < +\infty$: $\pi(u) = \pi(\mu v) = \mu \pi(v) = \mu$, where $u = \mu v$ and $\mu > 0$, with $v \in \text{bd}\, \text{lev}_1 \pi$, i.e. $\pi(v) = 1$. Thus all (strictly) positive values of $\pi$ are completely determined by the set $\text{lev}_1 \pi$.






Since $\text{lev}_1 \pi$ is convex, by the quasi-convexity assumption on $\pi$, there exists (at least) one hyperplane $H_v u = 1$ which supports $\text{lev}_1 \pi$ at the point $v$, i.e. $H_v u \le 1$, $\forall u \in \text{lev}_1 \pi$, and $H_v v = 1$.

Since the gauge $\pi$ is non-negative, the set $\text{epi}\, \pi$ can be represented as an intersection of the following convex sets:

$$\text{epi}\, \pi = \bigcap_{v \in \text{bd}\, \text{lev}_1 \pi} \{(h, u) \mid h \ge \max\{0, H_v u\},\ u \in R^n\}.$$

Q.E.D.
3.4. Dominance 

As indicated by the above Corollary, one sees that Tuy's theory is generalized in the present framework:

(i) For a given level set $\text{lev}_k f$, one constructs in a straightforward manner the subadditive gauge which corresponds to Tuy's cut (see Corollary 2.1).

(ii) Further $k$-valid inequalities can be obtained, however, when $\text{lev}_k f$ contains unbounded rays $(t_j e_j + \bar{x})$, $j \in N$: any convex gauge $\tilde{\pi}$ which agrees identically with $\pi$ on the set $R_f$ of intersected directions

$$R_f = \{u \in [\text{Aff}\, X - \bar{x}] \mid f(\lambda u + \bar{x}) > k, \text{ for some } \lambda \ge 1\}$$

will yield such an inequality.

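EXAMPLE (See Figure 2): $f = (xy)^{-1}$; $k = 1/2$; $\pi_0 = 1$, $\bar{x} = (2, 1)$, i.e. $x = u + 2$, $y = v + 1$.

A) $u \le 0$ or $v \le 0$: in the construction (3.8b) we must find $\hat{\lambda}$ from $[(\hat{\lambda} u + 2)(\hat{\lambda} v + 1)]^{-1} = 1/2$, i.e., $\hat{\lambda}^2 uv + \hat{\lambda}(u + 2v) = 0$, and thus

$$\hat{\lambda} = \frac{-(u + 2v)}{uv}.$$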
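The computation of $\hat{\lambda}$ in this example can be checked numerically. The following Python sketch is an illustration only (the function names are not from the paper); it assumes $\bar{x} = (2, 1)$, $k = 1/2$, $\pi_0 = 1$ as stated above, uses Tuy's choice $\pi = 0$ in the non-negative quadrant, and returns $+\infty$ whenever the ray does not meet the curve $f = 1/2$ at a positive $\lambda$.

```python
import math


def f(x, y):
    """Objective of the example: f(x, y) = 1 / (x * y)."""
    return 1.0 / (x * y)


def lambda_hat(u, v):
    """Positive root of lam^2*u*v + lam*(u + 2v) = 0, or None if the ray
    from (2, 1) in direction (u, v) does not reach f = 1/2 for lam > 0."""
    if u * v >= 0 or u + 2.0 * v <= 0:
        return None
    return -(u + 2.0 * v) / (u * v)


def level_set_gauge(u, v, pi0=1.0):
    """Level set gauge of the example, with Tuy's choice pi = 0 for u, v >= 0."""
    if u >= 0 and v >= 0:
        return 0.0
    lam = lambda_hat(u, v)
    return pi0 / lam if lam is not None else math.inf


lam = lambda_hat(-1.0, 1.0)
print(lam, f(2.0 + lam * -1.0, 1.0 + lam * 1.0))   # intersection point lies on f = 1/2
print(level_set_gauge(4.0, -1.0), level_set_gauge(-1.0, -1.0))
```

Outside the non-negative quadrant the finite values agree with the closed form $-uv/(u + 2v)$ on the half-plane $u + 2v > 0$.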




Figure 2a 







Figure 2b. Illustration of the construction of Tuy's cut. One intersection point is R, the other is at infinity, because the gauge $\pi$ vanishes in $u, v \ge 0$. The point R can be determined geometrically by intersecting the ray r with the curve.



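Now in the region

$$\tfrac{1}{2}u + v > 0$$

one has $\pi = \hat{\lambda}^{-1}$; $\pi = +\infty$ otherwise. Thus:

$$\pi(u, v) = \frac{-uv}{u + 2v} \quad \text{for } u \ge 0,\ v \le 0 \text{ and } \tfrac{1}{2}u + v > 0, \text{ and for } u \le 0,\ v \ge 0,\ \tfrac{1}{2}u + v > 0;$$

$\pi(u, v) = +\infty$ for $\tfrac{1}{2}u + v < 0$;
$\pi(0, 0) = 0$.

B) $u \ge 0$, $v \ge 0$ (Tuy): $\pi(u, v) = 0$. But other gauges can be obtained as follows: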



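$$\frac{\partial \pi}{\partial u} = \frac{-v}{u + 2v} + \frac{uv}{(u + 2v)^2} = \frac{-2v^2}{(u + 2v)^2}, \qquad \frac{\partial \pi}{\partial v} = \frac{-u}{u + 2v} + \frac{2uv}{(u + 2v)^2} = \frac{-u^2}{(u + 2v)^2};$$

for $u = 0$:

$$\frac{\partial \pi}{\partial u} = \frac{-2v^2}{4v^2} = -\frac{1}{2},$$

and for $v = 0$:

$$\frac{\partial \pi}{\partial v} = \frac{-u^2}{u^2} = -1.$$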







Figure 2c. Cut 1 differs from Tuy's cut in that it now passes through the points S and R; in Tuy's cut the point S was at $\infty$, while here the function

defines S. The polaroid gauge construction (4.8) corresponds geometrically to the envelope, within the non-negative quadrant $u, v \ge 0$, of all tangent hyperplanes to the set $\text{epi}\, f$ along the $u$ and $v$ axes. In the remaining region of $R^2$ with $u + 2v > 0$, the polaroid gauge $\pi$ remains identical to

$$\pi = \frac{-uv}{u + 2v}.$$

Thus, in the non-negative quadrant, one can choose any value satisfying:

$$0 \ge \pi(u, v) \ge \max\{-\tfrac{1}{2}u, -v\}.$$

The analytical derivation of this construction is based on polaroid extensions (see Section 4.2).

One easily verifies that the functions $\pi$ defined by A) and B) are all convex gauges with $\text{cl}(\text{lev}_1 \pi) = (\text{lev}_{1/2} f - \bar{x})$. Note that there are, in general, many possibilities to define suitable gauges in the region $u \ge 0$, $v \ge 0$; all generalize Tuy's cut; therefore it is natural to investigate the "best" such generalization.

THEOREM 3 (dominance): Assume $\pi_0 > 0$. The $k$-valid inequality

$$\sum_j \pi_j t_j \ge \pi_0, \quad \text{where } t_j \ge 0,$$

dominates uniformly any inequality

$$\sum_j a_j t_j \ge \pi_0$$

iff $a_j \ge \pi_j$, $\forall j$.







Figure 2d. Here the set $\text{lev}_{\pi_0} \pi$ is delimited by the curve $f = \tfrac{1}{2}$ ("above" P), the tangent line segment PQT and the half-line through T. One sees that $\pi = -v$ has disappeared and one now has

$$\pi = -\tfrac{1}{2}u, \qquad \pi = \frac{-uv}{u + 2v}, \qquad \pi = \frac{-(648u + v)}{1075}$$

and $\pi = +\infty$ in the four sectors defined in the illustration.



PROOF: Since $t_j \ge 0$, one has $a_j t_j \ge \pi_j t_j$; hence

$$\sum_j a_j t_j \ge \sum_j \pi_j t_j \ge \pi_0.$$

Q.E.D.

COROLLARY 3.1: The inequality

dominates Tuy's cut uniformly.

PROOF: Immediate from (3.10).

The "best" inequality can now be defined as one which dominates all others, i.e.

$$\bar{\pi}(u) = \inf \pi(u),$$

where $\pi$ satisfies (3.8).

$\bar{\pi}$ is a convex gauge because it is defined by pointwise inf in a class of convex gauges. In the above example $\bar{\pi}$ is easily seen to correspond to the choice

$$\bar{\pi}(u, v) = \max\{-\tfrac{1}{2}u, -v\} \quad \text{for } u \ge 0,\ v \ge 0.$$







Figure 2e. Illustration of the most stringent inequality: cut 3 is vertical through P.

We shall see in Section 4.1 that the gauge $\bar{\pi}$ is the maximal convex extension of $\pi$ with respect to the set $S = R_f$. (See Remark R1 in Section 4.1 and cut 1 in Figure 2.)



3.5. Subadditive Gauges Based on f and X 

In his note [6] Tuy noticed that one could improve his cut by a conceptually simple argument. He characterizes a convex extension $F$ of $f$ with respect to $X$ by:

(i) $F(x) \le f(x)$, $\forall x$

(3.11) (ii) $F(x) = f(x)$, $\forall x \in X$

(iii) $F$ convex

This leads to the following "improvement" of the optimality condition (2.1):

$$X \subset \text{lev}_k f \iff X \subset \text{lev}_k F,$$

where the improvement stems from the property $\text{lev}_k f \subset \text{lev}_k F$.

Further results in this direction can be obtained within a subadditive framework:

(3.12a) a) $\pi$ is a convex gauge with $\pi(0) = 0$.

(3.12b) b) as (3.8b) where only those $x$ which satisfy $x \in X$ are considered.

REMARK: (3.12b) is thus a relaxation of (3.8b) and one has
(3.12c) $(\text{cl}\, \text{lev}_{\pi_0} \pi) \cap X = [(\text{lev}_k f) - \bar{x}] \cap X$;




this $\pi$ also generates a $k$-valid inequality; furthermore,

(3.13) $(\text{cl}\, \text{lev}_{\pi_0} \pi) \supset [(\text{lev}_k f) - \bar{x}]$

and from (3.8d), one sees that the inequality based on (3.12) will uniformly dominate the inequality based on (3.8). This result also dominates Tuy's extension (3.11): indeed (3.12) is based on the convexity of the level sets and therefore corresponds to a quasi-convex extension rather than the convex extension (3.11); moreover we do not require $f$ to be convex (see also [2]).

3.6. The "Best" Subadditive Gauges 

Finally one may bring an ultimate improvement in the definition of the gauge $\pi$ which will yield (at least conceptually) $k$-valid inequalities which dominate uniformly all the others.

Define the set

$$S_\pi = \{x = x(t) \in X \mid \sum_j \pi_j t_j < \pi_0\}$$

and impose the following conditions on $\pi$:

(3.14a) a) $\pi$ is a convex gauge, with $\pi(0) = 0$

(3.14b) b) as (3.8b) where only those $x$ which satisfy $x \in S_\pi$ are considered.

REMARK: (3.14) is a relaxation of (3.12) because $S_\pi \subset X$.

LEMMA 5: The gauge $\pi$ defined by (3.14) generates a $k$-valid inequality.

PROOF: Take $x \in X$ with $f(x) > k$. Suppose $x \in S_\pi$; then since

$$x = \bar{x} + u = \bar{x} + \sum_j e_j t_j$$

one has

$$\pi_0 \le \pi(u) = \pi\Big(\sum_j e_j t_j\Big) \le \sum_j \pi_j t_j$$

from (3.14b); but for all $x \in S_\pi$, one has by construction

$$\sum_j \pi_j t_j < \pi_0;$$

hence $x \notin S_\pi$. Q.E.D.

THEOREM 4: For every $k$-valid inequality

$$\sum_j a_j t_j \ge a_0 \quad (> 0),$$

there exists a convex gauge $\pi$ satisfying (3.14) and

$$\pi_0 = a_0.$$

PROOF: Define

$$S_a = \{x \in X \mid \sum_j a_j t_j < a_0\};$$

by hypothesis $S_a$ contains no point $x$ with $f(x) > k$. Setting $\pi_0 = a_0$ and $\pi(e_j) = a_j$ (with $\pi(0) = 0$) one defines a gauge $\pi$, which is a hyperplane, and which satisfies (3.14); in general, there may exist a way to modify the above hyperplanar gauge into a convex gauge $\pi$, so that one may have the more general relation $\pi_j \le a_j$, $\forall j \in N$. Q.E.D.




REMARK: The above theorem indicates that every supporting hyperplane of the convex hull

$$\text{conv}\, \{x \in X \mid f(x) \ge k\}$$

may be generated by a subadditive gauge $\pi$. Thus, in principle, the subadditive gauge approach can produce the most stringent $k$-valid inequalities.

3.7. Example 

We now illustrate the construction of $k$-valid inequalities as described in Sections 3.4, 3.5, and 3.6.

Let

$$f(x, y) = (xy)^{-1}$$

and

$X$: $4x + y \ge 22/3$, $x, y \ge 0$
$x + y \ge 3$
$x - y \le 1$ (see Figure 2).

3.7.1. "Tuy" and "Cut 1." Choose $\bar{x} = (2, 1)$ and define the gauge $\pi$ as in the example of Section 3.4. One obtains the following coefficients:

$\pi_1 = \pi(-1, 1) = (\hat{\lambda})^{-1} = 1$
$\pi_2 = \pi(1, 1) = \max\{-\tfrac{1}{2}, -1\} = -\tfrac{1}{2}$

For the ($k = \tfrac{1}{2}$)-valid inequalities, one has ($\pi_0 = 1$):

Tuy: $t_1 \ge 1$, i.e. $-x + y \ge 1$
Gauge $\pi$: $t_1 - \tfrac{1}{2} t_2 \ge 1$, i.e. $3x - y \le 1$

$\pi$ is the maximal convex extension with respect to $S_1$ (see end of Section 3.4 and Section 4.2).
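The coefficients of Section 3.7.1 can be reproduced with exact rational arithmetic. The Python sketch below is illustrative only and rests on stated assumptions: the edge directions $e_1 = (-1, 1)$ and $e_2 = (1, 1)$ at $\bar{x} = (2, 1)$ are read off from $\pi_1 = \pi(-1, 1)$ and $\pi_2 = \pi(1, 1)$, the gauge formulas are those of the example of Section 3.4 on the half-plane $u + 2v > 0$, and all function names are hypothetical.

```python
from fractions import Fraction as F

X_BAR = (F(2), F(1))
E1, E2 = (F(-1), F(1)), (F(1), F(1))    # assumed edge directions at x_bar


def pi_tuy(u, v):
    """Tuy's gauge: zero in the non-negative quadrant, -uv/(u+2v) elsewhere
    (only meaningful here on the half-plane u + 2v > 0)."""
    return F(0) if (u >= 0 and v >= 0) else -u * v / (u + 2 * v)


def pi_cut1(u, v):
    """Gauge of cut 1: max{-u/2, -v} in the non-negative quadrant."""
    return max(-u / 2, -v) if (u >= 0 and v >= 0) else -u * v / (u + 2 * v)


def edge_coordinates(x, y):
    """Solve (x, y) = x_bar + t1*e1 + t2*e2 for (t1, t2)."""
    dx, dy = x - X_BAR[0], y - X_BAR[1]
    return (dy - dx) / 2, (dx + dy) / 2


def cut_lhs(gauge, x, y):
    """Left-hand side pi_1*t_1 + pi_2*t_2 of the cut, evaluated at (x, y)."""
    t1, t2 = edge_coordinates(x, y)
    return gauge(*E1) * t1 + gauge(*E2) * t2


print(pi_tuy(*E1), pi_tuy(*E2))      # Tuy's coefficients
print(pi_cut1(*E1), pi_cut1(*E2))    # coefficients of cut 1
```

With $\pi_0 = 1$, comparing `cut_lhs(pi_cut1, x, y) >= 1` against $3x - y \le 1$ at a few points is a quick way to confirm the form of cut 1 in the original variables.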

3.7.2. "Cut 2." The above cuts are based solely on $\text{lev}_{1/2} f$; the following gauge now takes the set $X$ into consideration (3.12):

I) in the halfspace $\tfrac{1}{2}u + v > 0$:

IA) $u \le 0$, $3u + v \ge 0$: $\pi(u, v) = \dfrac{-uv}{u + 2v}$

$\pi(0, 0) = 0$

IB) ($u \le 0$ and $3u + v \le 0$) or $u \ge 0$: $\pi(u, v) = \max\{-\tfrac{1}{2}u,\ \dfrac{-(648u + v)}{1075}\}$




II) for $\tfrac{1}{2}u + v < 0$: $\pi(u, v) = +\infty$

Thus one has

$$\pi_1 = \frac{647}{1075}, \qquad \pi_2 = -\frac{1}{2}.$$

3.7.3. "Cut 3." Finally we also show (cut 3) the inequality corresponding to the convex hull described in the final remark of Section 3.6.

4. GAUGE EXTENSIONS

In this section, we apply the general methodology developed in [2] to characterize some $f$ and $X$ related gauge functions (see Section 3.5). These gauges are described as (maximal) convex extensions, with respect to the set $X$, of the $f$ related gauge functions of Section 3.4.

It is also shown that the latter are (maximal) convex extensions of the Tuy-type gauge functions given in Corollary 2.1. Thus the concept of (maximal) convex extensions of a convex function with respect to a given set appears here as a fundamental tool.

We then derive a result indicating that the maximal convex gauge extension corresponds in fact to a convex gauge based on the maximal quasi-convex extension of $f$; it therefore dominates all previously known results of this type, in particular those due to Tuy [6].

Finally we introduce the notion of polaroid gauge functions (see also [3]). It is shown that, when the level set gauge is differentiable ($\forall u \ne 0$), the polaroid approach offers an analytical method to determine the maximal convex gauge extension.

4.1. Convex Gauge Extensions 

Consider the convex gauge $\pi$, defined on $R^n$, and let $S$ be a subset of $R^n$ with non-empty interior ($\text{Int}\, S \ne \emptyset$).

Since $\pi$ is a convex gauge, the set $\text{epi}\, \pi$ is convex and it can be represented by the intersection of all halfspaces $H$ which support $\text{epi}\, \pi$: $H(y) = \{(h, u) \mid h \ge H_y u,\ u \in R^n\}$ with $h = H_y y = \pi(y)$, and $\pi(u)$ can now be represented by

(4.1) $\pi(u) = \sup_{y \in R^n} \{H_y u\}$.

The maximal convex gauge extension $\Pi$ of $\pi$ with respect to $S$ is constructed by relaxing (4.1), where only certain points $y$ in the sup are selected:

DEFINITION 3:

(4.2) $\Pi(u) = \sup_{y \in \text{Int}\, S} \{H_y u\}$

PROPERTIES:

P1) One has $\pi(u) \ge \Pi(u)$, $\forall u$, which implies dominance of the inequalities generated from $\pi$ by those stemming from $\Pi$.
P2) $\Pi$ is convex




(4.3) 

P3) $\Pi$ is a gauge
(4.4)

Proof: $\Pi(\lambda u) = \sup_{y \in \text{Int}\, S} \{H_y \lambda u\} = \lambda \sup_{y \in \text{Int}\, S} \{H_y u\} = \lambda \Pi(u)$, for $\lambda \ge 0$. Q.E.D.

P4) $\text{lev}_{\Pi_0} \Pi = \text{lev}_{\pi_0} MQC\pi \supset \text{lev}_{\pi_0} \pi$, for $\Pi_0 = \pi_0 > 0$, where $MQC\pi$ denotes the maximal quasi-convex extension of $\pi$ with respect to the set $S$ (see [2] for further details on MQC functions).

PROOF: We need to show that the set $\text{lev}_{\pi_0} \Pi$ is the maximal convex extension (with respect to $S$) of the set $\text{lev}_{\pi_0} \pi$. For each $y \in \text{Int}\, S$, one has by construction $\Pi(y) = \pi(y)$ and therefore also $\Pi(\lambda y) = \pi(\lambda y)$, $\forall \lambda \ge 0$. Now if $\pi_0$ is such that $\text{lev}_{\pi_0} \pi \cap \text{Int}\, S = \emptyset$, we simply have $\Pi = 0$, by definition. Otherwise consider the collection of halfspaces

$$G(y) = \{u \in R^n \mid H_y u \le \pi(y) = (H_y y)\}.$$

One has

$$\text{lev}_{\pi_0} \Pi = \text{cl} \bigcap_{y \in (\text{Int}\, S\, \cap\, \text{lev}_{\pi_0} \pi)} G(y),$$

for $\pi_0 > 0$, which is (by definition) the maximal convex extension of the set $\text{lev}_{\pi_0} \pi$ with respect to $S$ (see [2], Theorem 2). Q.E.D.

P5) By construction of the level set gauge $\pi$ (3.8), one has $\text{lev}_{\pi_0} \pi = [(\text{lev}_k f) - \bar{x}]$, where $k \ge f(\bar{x})$, $\bar{x} \in S$; thus one also has $\text{lev}_{\pi_0} MQC\pi = [(\text{lev}_k MQCf) - \bar{x}]$, since both MQC functions are obtained from the maximal convex extension of the same sets.

Thus from the above property P4, we observe that the convex gauge extension $\Pi$ is the gauge belonging to $MQCf$, not $f$; it therefore brings about "better" optimality conditions for the program (2.1).

REMARKS: There are two instances for which an extension process is called for in our approach:

R1) for the definition of the convex gauge $\Pi$ along rays where the function $f$ is unbounded ($S = R_f$), as explained at the end of Section 3.4;

R2) for the construction of a gauge which takes the feasible region $X$ into account: $S = X - \bar{x}$.

Thus one may naturally choose the following set $S$ in Definition 3:

(4.5) $S = [\{x \in X \mid f(x) > k\} - \bar{x}]$

in order to combine both aspects R1 and R2.

4.2. Polaroid Gauges 

We now use a generalized concept of polarization introduced in [1] to construct subadditive gauges. Consider a function $\Phi(u; p): R^n \times P \to R$ and a set $P$; assume $\Phi(\cdot\,; p)$ to be a convex gauge $\forall p \in P$ and define the polaroid gauge (see [1, 3]):

(4.6) $\Pi(u) = \sup_{p \in P} \Phi(u; p)$

LEMMA 6: $\Pi(u)$ is a convex gauge.




PROOF (omitted): See Lemmas 7 and 8 of [3].

COROLLARY 6.1: The inequality

$$\sum_j \Pi_j t_j \ge \Pi_0$$

is $k$-valid with

(4.7) $\Pi_0 = \min_{u \in S} \sup_{p \in P} \Phi(u; p)$

EXAMPLE: Let $\pi(u)$ be a level set gauge defined by (3.10) and assume $\pi$ differentiable for all $u \ne 0$ with $\hat{\lambda}(u) < +\infty$. Define

(4.8) $\Phi(u; p) = \pi(u) + (p - u)\nabla\pi(u)$

Then choose $P = S \subset R^n$ to obtain for each $u \in S$:

$$\Pi(u) = \sup_{p \in P} \Phi(u; p) = \Phi(u; u) = \pi(u).$$

Thus

$$\Pi_0 = \min_{u \in S} \pi(u) = \pi_0,$$

the same value as in Corollary 2.1.

Note that $\Pi(u)$ is in general different from $\pi(u)$ for $u \notin S$. Furthermore, one can show (see [2], Lemma 12) that the gauge $\Pi$ constructed here is, in fact, the maximal convex extension of $\pi$ with respect to $S$.
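For the gauge $\pi(u, v) = -uv/(u + 2v)$ of Example 3.7, the extension can be approximated numerically. The Python sketch below is an illustration under stated assumptions: it reads Definition 3 as a supremum of tangent (supporting) linear forms $\nabla\pi(y) \cdot u$ taken at points $y$ of $S_1 = \{u \le 0 \text{ or } v \le 0\}$ with $u + 2v > 0$, it replaces the supremum by a maximum over a finite sample, and the sampling scheme and function names are hypothetical.

```python
def grad_pi(a, b):
    """Gradient of pi(u, v) = -u*v/(u + 2v) at the point (a, b), a + 2b != 0."""
    d = (a + 2.0 * b) ** 2
    return (-2.0 * b * b / d, -a * a / d)


def sample_s1(n=2000):
    """One point per sampled direction of S_1 with u + 2v > 0; the gradient of
    a gauge depends only on the direction, so the scaling is irrelevant."""
    pts = []
    for i in range(1, n + 1):
        t = i / (n + 1.0)
        pts.append((-2.0 * t, 1.0))    # u < 0 < v, u + 2v = 2 - 2t > 0
        pts.append((1.0, -0.5 * t))    # v < 0 < u, u + 2v = 1 - t > 0
    return pts


def polaroid_gauge(u, v, samples):
    """Finite-sample approximation of sup over y of grad_pi(y) . (u, v)."""
    return max(gu * u + gv * v for gu, gv in (grad_pi(a, b) for a, b in samples))


samples = sample_s1()
for u, v in [(1.0, 1.0), (4.0, 1.0), (1.0, 3.0)]:
    print((u, v), polaroid_gauge(u, v, samples), max(-u / 2.0, -v))
```

In the non-negative quadrant the sampled supremum approaches $\max\{-u/2, -v\}$ from below as the sample directions approach the two axes, which is the value attached to cut 1.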

Note that this extension is obtained from a polarization of the level set gauge $\pi$, not of the objective function $f$. Although one has, by definition, $\text{lev}_{\pi_0} \pi = [(\text{lev}_k f) - \bar{x}]$, the polaroid gauges derived from $\pi$ and $f$ are quite different in general.

REMARKS: 1) If $\pi(u)$ is not differentiable $\forall u \ne 0$, the gradient $\nabla\pi(u)$ can be replaced by the subdifferential $\partial\pi(u)$ in (4.8), since $\pi$ is convex: i.e.

(4.9) $\Phi(u; p) = \pi(u) + \sup_{\nabla\pi \in \partial\pi(u)} (p - u)\nabla\pi$

2) In principle, the coefficients $\Pi_j$ and $\Pi_0$ are obtained from an auxiliary program, i.e.

(4.10a) $\Pi_j = \Pi(e_j) = \sup_{p \in P} \Phi(e_j; p)$

(4.10b) $\Pi_0 = \min_{u \in S} \sup_{p \in P} \Phi\Big(\sum_j e_j t_j; p\Big)$

with

$$u = \sum_j e_j t_j.$$



The nature and degree of difficulty of these programs (4.10) largely depend upon the set $P$ and the form of the function $\Phi$.

Since there are $n$ coefficients $\Pi_j$, one will try to keep (4.10a) as simple as possible.

EXAMPLES. In the following two examples, the program (4.10a) reduces to the explicit determination of the largest number in a finite list:

1) $P$ finite $= \{p_1, p_2, \ldots, p_P\}$, i.e. $\Phi(u; p) = \Phi_p(u)$, $p \in P$

2) $P$ may also be a list of finite index sets

$$P = \{P_1, P_2, \ldots, P_T\}$$

with the following convex gauges $\Phi_r$, $r \in T$:

$$\Phi_r(u; p), \quad p \in P_r,$$

for instance

$$\Phi_r(u) = \sum_{p \in P_r} \Phi_p(u),$$

and the corresponding polaroid gauge becomes:

$$\Pi(u) = \max_{r \in T} \sum_{p \in P_r} \Phi_p(u).$$
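The first of these two cases is easy to sketch in code. The Python fragment below is a minimal illustration, assuming a finite set $P$ and the simplest possible convex gauges, namely the linear forms $\Phi_p(u) = p \cdot u$; this linear choice, the data `P` and `E`, and the function name are assumptions made for the example, not the paper's construction.

```python
def polaroid_finite(u, points):
    """Pi(u) = max over p in P of Phi_p(u), with the assumed linear choice
    Phi_p(u) = p . u (a positively homogeneous convex function of u)."""
    return max(sum(p_i * u_i for p_i, u_i in zip(p, u)) for p in points)


# Hypothetical finite set P and unit edge directions e_j.
P = [(1.0, 0.0), (0.0, 1.0), (-1.0, -1.0)]
E = [(1.0, 0.0), (0.0, 1.0)]

# (4.10a) reduces to picking the largest number in a finite list.
coeffs = [polaroid_finite(e_j, P) for e_j in E]
print(coeffs)
```

Being a maximum of linear forms, this $\Pi$ is positively homogeneous and subadditive, so each coefficient costs one pass over the list $P$.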

We may now analyse Example 3.7 and Figure 2. Starting from Tuy's cut (and the corresponding level set gauge which has $\pi(u, v) = 0$, $\forall u, v \ge 0$
and

the cut 1 is obtained by polarization (4.8) with $S_1 = \{(u, v) \mid u \le 0 \text{ or } v \le 0\}$; this corresponds to remark R1, the end comments of Section 4.1, and to the end comments of Section 3.4.

For the corresponding polaroid gauge one has: $\Pi(u) = +\infty$ in the halfspace $u + 2v < 0$; otherwise, in the halfspace $u + 2v \ge 0$:

$$\Pi(u, v) = \sup_{\mu \le 0 \text{ or } \nu \le 0} \frac{-(2\nu^2 u + \mu^2 v)}{(\mu + 2\nu)^2} = \begin{cases} \dfrac{-uv}{u + 2v} & \text{for } u \le 0 \text{ or } v \le 0 \\ \max\{-\tfrac{1}{2}u, -v\} & \text{for } u, v \ge 0 \end{cases}$$

Similarly for the cut 2, with




BIBLIOGRAPHY 

[1] Burdet, C. A., "Polaroids: A New Tool in Non-Convex and in Integer Programming," Naval Research Logistics Quarterly, 20, 13-24 (1973).
[2] Burdet, C. A., "Convex and Polaroid Extensions," WP 73-21, Faculty of Management Sciences, University of Ottawa (1973).
[3] Burdet, C. A., "On the Algebra and Geometry of Cuts," WP 74-8, Faculty of Management Sciences, University of Ottawa (1974).
[4] Burdet, C. A., "On Polaroid Intersections," Mathematical Programming in Theory and Practice, P. Hammer and G. Zoutendijk, eds., pp. 365-387 (North Holland, 1974).
[5] Burdet, C. A., "The Facial Decomposition Method," Operations Research Quarterly, 24, 459-463 (1973).
[6] Tuy, Hoang, "Concave Programming Under Linear Constraints," (Russian) Doklady Akademii Nauk SSSR, 1964. English translation in Soviet Mathematics, 1437-1440 (1964).
[7] Johnson, E. L., "A Group Problem for Mixed Integer Programs," to appear in Mathematical Programming (1975).
[8] Rockafellar, R. T., "Convex Analysis," (Princeton University Press, 1970).



CONVEX AND POLAROID EXTENSIONS 



Claude-Alain Burdet 

SYSTEMATHICA Consulting Group Ltd. 
Pittsburgh, Pennsylvania 



ABSTRACT 

In an effort towards a comprehensive and unified theory, this note presents some new results in the area of non-convex programming within the framework of convex (sets and function) analysis.

The entire study is primarily devoted to the development of useful tools for extreme point programs (such as concave or integer programs).



0. BACKGROUND 

This paper presents several new ideas in convex analysis to come to grips with non-convex programming problems. It lies in the mainstream of developments in the theory of polar sets [7], generalized polars [1] and polaroids [2]; although some results rely heavily on concepts introduced in [2] and [3], the paper is self-contained. The line of research pursued here is related to the following general constrained optimization problem:

Consider the non-linear program

maximize $f(x)$, subject to $x \in X \subset R^n$.

One has immediately the following paraphrase of the global optimality property, which may be interpreted as a necessary and sufficient optimality condition:

[$\bar{x}$ is globally optimal] $\Leftrightarrow$

The object of this note is to establish weaker sufficiency conditions for the case where both $f$ and $X$ are convex.

In his short note [5], Tuy proposed some interesting ideas which can be applied to this kind of problem, particularly when $X$ is polyhedral. This paper is an attempt to produce more powerful results (both theoretically and practically) and to provide some elements of mathematical structure in this area.

In a first section we introduce the concept of convex extension of a set with respect to another. It includes a pointwise definition, a study of the question of maximality, and an equivalent definition by means of hyperplanes.


This concept is applied to define convex and quasi-convex extensions of a given convex function $f$ with respect to a (convex) set (Sections 2 and 3); we also construct a maximal convex extension of $f$, which is shown to have the same properties as the extension described geometrically by Tuy [5]; furthermore, a dominance theorem is established.

Section 4 briefly reviews an application of these extensions to the non-convex programming problem area.

The next section features another type of extension, called polaroid extension, which possesses the advantage (over the previous two types of extension) of being more computationally tractable.

Section 6 handles the special case where $f$ is assumed differentiable; an analytical definition of the maximal convex extension of $f$ is presented; it is also imbedded in a family of polaroid extensions. The end of this section is devoted to a brief description of the quadratic case [1], where the general results derived in the previous sections assume a special form; parallel studies [1, 4 and 8] develop the corresponding algorithmic implementations.

We then examine in greater depth the application of the present results to non-convex programs; it is shown how a valid cut can be constructed from an extension.

In the second part of Section 7 a new type of cut based on a cutting polaroid is established: it combines cutting planes and polaroids to produce a "deep" cut where the two theories interact.

Finally a last section indicates some further improvements which can be obtained when the feasible set is polyhedral.

To conclude this introductory tour it should be pointed out that little effort is made here to investigate every aspect of possible interest in the theory of polaroids. We have tried instead to adopt a line of development which seems most promising as far as implementation of new ideas is concerned. This has been done with the double intent of obtaining early results of tangible interest and of attracting others to the development of an open area.

1. SET DEFINITION AND PROPERTIES 

Consider a closed convex set $C$ in $R^n$ and a convex set $D$ with non-empty interior (i.e. $R^n = \text{aff}\, D$).

DEFINITION 1: A convex extension $E$ of $C$ with respect to $D$ is defined to satisfy:

a) $E$ is convex

b) $E \supset C$

c) $(E \cap \text{Int}\, D) \subset (C \cap \text{Int}\, D)$

Note: b) and c) imply $(E \cap \text{Int}\, D) = (C \cap \text{Int}\, D)$.

LEMMA 1: $E$ is closed.

PROOF: We show that the complement of $E$ is open; take $x_1 \notin E$, then there exists $x_2 \in E$ such that

$$x_3 = \lambda x_1 + (1 - \lambda)x_2, \quad \lambda \in (0, 1]$$

satisfies

$x_3 \in \text{Int}\, D$ but $x_3 \notin C$.

$C$ is closed, hence there exists an open ball $B(x_3)$ such that

$B(x_3) \subset D$ but $B(x_3) \cap C = \emptyset$.






Since $\lambda \ne 0$, we may define the following open neighborhood of $x_1$:

$$U(x_1) = \{x = \lambda^{-1}[y - (1 - \lambda)x_2] \mid y \in B(x_3)\};$$

and $\forall x \in U(x_1)$ there exists $y \in \text{Int}\, D$ with $y \notin C$, hence $x \notin E$, i.e., $U(x_1) \cap E = \emptyset$. Q.E.D.

The following example shows that $E$ need not be unique.

EXAMPLE 1:

DEFINITION 2: A convex extension $E$ is said to be maximal if $E \supset E'$ for any convex extension $E'$ (of $C$ with respect to $D$).

LEMMA 2: A convex extension $E$ is maximal iff it contains all points $x_1$ such that, $\forall x_2 \in C$, one has $x_3 \in \text{Int}\, D \Rightarrow x_3 \in C$, with $x_3 = \lambda x_1 + (1 - \lambda)x_2$, $\lambda \in [0, 1]$.

PROOF: From Definition 1, one sees that every point $x_1$ of a convex extension satisfies the above; if a convex extension $E$ contains all such points, it contains every extension and is therefore maximal.

Conversely, we show that there always exists a convex extension which contains $x_1$; it then follows from the maximality of $E$ that $x_1 \in E$. Construct the set

$$V(x_1) = \{x \mid x = \lambda x_1 + (1 - \lambda)y,\ \lambda \in [0, 1],\ y \in C\}.$$

Clearly $V$ is convex; from the hypothesis one also has

$$V(x_1) \cap \text{Int}\, D = C \cap \text{Int}\, D. \quad \text{Q.E.D.}$$

Example 1 shows that the set of all points $x_1$ (Lemma 2) need not be convex (and therefore not an extension $E$ according to Definition 1).

THEOREM 1 (existence and uniqueness): If $(C \cap \text{Int}\, D) \ne \emptyset$, there exists exactly one maximal extension $E$.

PROOF (existence): The set $Q$ of points $x_1$ satisfying the conditions of Lemma 2 satisfies b) and c) of Definition 1; it remains to show convexity.

Let $x_1^3 = \lambda x_1^1 + (1 - \lambda)x_1^2$, $\lambda \in [0, 1]$, where $x_1^1, x_1^2 \in Q$. First we note that if the (convex) segment $x_1^1 x_1^2$ ($= S$) intersects $C$ then $x_1^3 \in Q$ because $x_1^1$ ... ourselves to the case where the segment $S$ does not intersect $C$. Construct the sets $V(x_1^1)$ and $V(x_1^2)$ (see Lemma 2); define $\bar{y} \in \text{bd}\, C$ with $\bar{y} = \mu x_1^3 + (1 - \mu)y$, $\mu \in (0, 1)$, for any $y \in C \cap \text{Int}\, D$; it is sufficient to show $\bar{y} \notin \text{Int}\, D$, implying $x_1^3 \in Q$.

One has $\bar{y} \in \text{cl}\, [V(x_1^1) - C] \cap \text{cl}\, [V(x_1^2) - C]$, because $S \cap C = \emptyset$; but $[V(x_1^i) - C] \cap \text{Int}\, D = \emptyset$, hence also $\text{cl}\, [V(x_1^i) - C] \cap \text{Int}\, D = \emptyset$ and therefore $\bar{y} \notin \text{Int}\, D$.

(Uniqueness): For any two maximal extensions $E$, $E'$ one must have $E \subset E'$ and $E' \subset E$, hence $E = E'$. Q.E.D.

Theorem 1 provides a pointwise definition of the maximal convex extension of $C$ with respect to $D$. We now turn to an equivalent definition, by means of halfspaces.

THEOREM 2: The maximal convex extension $E$ of $C$ with respect to $D$ is the closed intersection, $\forall y \in (C \cap \text{Int}\, D) \ne \emptyset$, of all halfspaces $H(y) = \{x \mid h_y x \le h_0\}$ corresponding to hyperplanes $\{h_y x = h_0\}$ which support $C$ at $y$ (i.e., $h_0 = h_y y$ and $h_y x \le h_0$, $\forall x \in C$):

$$E = \text{cl} \bigcap_{y \in C \cap \text{Int}\, D} H(y).$$

PROOF: $E$ clearly contains $C$ and is convex; furthermore, $(E \cap \text{Int}\, D) = (C \cap \text{Int}\, D)$. Hence $E$ is a convex extension. We next show that $E$ contains every point $x_1$ (see Lemma 2); take any $x_1$, i.e., $\forall x_2 \in C$ one has $x_3 \in \text{Int}\, D \Rightarrow x_3 \in C$. Suppose there exists a supporting hyperplane at $x'_2 \in C \cap \text{Int}\, D$ separating $x_1$ from $C$; then there exists no $x'_3 = \lambda x_1 + (1 - \lambda)x'_2$, $\lambda \in [0, 1]$, with $x'_3 \in C$; however, there is an open ball $B(x'_2) \subset D$ so that $x'_3 \in B(x'_2)$; this contradicts the property of $x_1$ stipulated in Lemma 2. Thus, there is no such separating support and $x_1 \in E$. Maximality of the set of points $x_1$ (Lemma 2) now implies the reverse inclusion; hence both definitions characterize the same set $E$. Q.E.D.

Note that we have to define $E$ as the closure of this intersection of halfspaces because, in general, this intersection may not be closed, as shown in Example 2.
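In one dimension the halfspace characterization becomes very concrete, since the only supporting "hyperplanes" of a closed interval $C$ are its endpoints. The following Python sketch is an illustration only, restricted to closed intervals on the real line, with a hypothetical function name: an endpoint of $C$ that lies in $\text{Int}\, D$ contributes a halfspace and is kept, while an endpoint outside $\text{Int}\, D$ imposes no restriction and is pushed to infinity.

```python
import math


def maximal_extension_1d(C, D):
    """Maximal convex extension of the closed interval C = (c_lo, c_hi) with
    respect to the interval D = (d_lo, d_hi), assuming C meets Int D."""
    c_lo, c_hi = C
    d_lo, d_hi = D
    if not (c_lo <= c_hi and d_lo < d_hi and c_hi > d_lo and c_lo < d_hi):
        raise ValueError("C must be a non-empty interval meeting the interior of D")
    # Keep a bound only if that endpoint of C lies in the open interval Int D.
    lo = c_lo if d_lo < c_lo < d_hi else -math.inf
    hi = c_hi if d_lo < c_hi < d_hi else math.inf
    return (lo, hi)


print(maximal_extension_1d((0.0, 1.0), (0.5, 2.0)))
print(maximal_extension_1d((0.0, 1.0), (-1.0, 0.5)))
```

In the first call the left endpoint $0$ lies outside $\text{Int}\, D = (0.5, 2)$, so the extension is unbounded to the left, while the right endpoint $1$ lies inside $\text{Int}\, D$ and is retained.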

EXAMPLE 2 





It may also be useful to remark that this concept of extension of a set with respect to another 
can be generalized to non-convex sets. 

2. CONVEX EXTENSIONS 

Consider now the real-valued function $f$; let $X$ be a closed convex subset of $R^n$ and assume $X \subset \text{dom}\, f$ and $\text{Int}\, X \ne \emptyset$.

LEMMA 3: Consider a convex set $A \subset (R^n \times R)$ and define

$$f(x) = \inf \{r, +\infty \mid (x, r) \in A\};$$

then $f$ is convex.

PROOF: Take $x = \lambda x_1 + (1 - \lambda)x_2$. We show $f(x) \le \lambda f(x_1) + (1 - \lambda)f(x_2)$, $\lambda \in [0, 1]$. By convexity of $A$ one has:

$$f(x) = \inf_{(x, r) \in A} r \le \lambda r_1 + (1 - \lambda)r_2 \quad \text{for any } r_1, r_2$$

such that $(x_1, r_1)$ and $(x_2, r_2) \in A$; hence, also for $f(x_1) = \inf r_1$ and $f(x_2) = \inf r_2$. Q.E.D.

The converse of Lemma 3 is, of course, also true, i.e., the epigraph

$$A = \text{epi}\, f = \{(x, r) \mid x \in X,\ r \in R,\ r \ge f(x)\}$$

is convex.

Define the (convex) cylinder $D$ over the convex set $X$:

$$D = \{(x, \delta) \mid x \in X,\ \delta \in R\} \subset R^{n+1}$$

DEFINITION 3: A convex extension $CF$ of the function $f$ with respect to the (closed convex) set $X$ is a function whose epigraph $G$ is the convex extension of the set $\text{epi}\, f$ with respect to $D$, i.e.,

$$CF(x) = \min \{+\infty, \varphi \mid (x, \varphi) \in G\}.$$

In mathematical programming, we are really only interested in the values of $f$ over the feasible set $X$; the next Lemma shows that $f$ and $CF$ are "essentially" identical on $X$, i.e., that one may replace $f$ by its extension $CF$.

LEMMA 4: $\forall x \in \text{Int}\, X$, one has $CF(x) = f(x)$.

PROOF: For $x \in X$ one has $(x, \varphi) \in D$, $\forall \varphi \in R$; furthermore, from Definition 1,

$$(\text{Int}\, D \cap \text{epi}\, f) = (\text{Int}\, D \cap \text{epi}\, CF)$$

holds true. Hence, $\forall x \in \text{Int}\, X$,

$$CF(x) = \min \{+\infty, \varphi \mid (x, \varphi) \in \text{epi}\, CF\} = \min \{+\infty, \varphi \mid (x, \varphi) \in \text{epi}\, f\} = f(x). \quad \text{Q.E.D.}$$

We now examine the discrepancies between $f$ and $CF$, i.e., the special cases where the two are not rigorously identical.

LEMMA 5: The convex extension of a convex function on $X$ is a continuous convex function.

PROOF: Convexity follows from convexity of $\text{epi}\, CF$ and Lemma 3. Furthermore, the set $\text{epi}\, CF$ is closed (Lemma 1), implying continuity of the min function. Q.E.D.

Note that Lemma 5 does not require continuity of $f$; and, if $f$ is discontinuous, one may have $f(x) \ne CF(x)$ for some $x \in \text{bd}\, X$.




COROLLARY L5.1: If $f$ is continuous on $X$ then $f(x) = CF(x)$, $\forall x \in X$.

PROOF: Take $x \in \text{bd}\, X$; continuity of $CF$ and $f$, with $f(x) = CF(x)$, $\forall x \in \text{Int}\, X$, implies equality on the boundary also. Q.E.D.

Maximality was introduced in Section 1 because, as we shall see below, it introduces in a natural way the most "desirable" extensions, i.e., those which dominate the others and yield better sufficient conditions. If one considers the maximal convex extension in Definition 3, then the resulting function $MCF$ is called the maximal convex extension of $f$; and one has:

THEOREM 3: For any real-valued convex functions $f$, $f'$ with $f'(x) = f(x)$, $\forall x \in X$, one has

$$f'(x) \ge MCF(x), \quad \forall x \in R^n.$$

PROOF: Maximality of $\text{epi}\, MCF$ with respect to $D$ implies $\text{epi}\, f' \subset \text{epi}\, MCF$. Q.E.D.

Note that $\forall x \in R^n$ one has $MCF(x) \le f(x)$, so that $MCF$ can only become $+\infty$ when and where $f$ has the same property.

In the special case where $f$ is not continuous on $X$ (discontinuities may only occur on $\text{bd}\, X$), the epigraph should be replaced by its closure: $\text{cl}\, \text{epi}\, f$.

The following immediate corollary relates the present construction to the concept envisaged by Tuy in [5]:

COROLLARY T3.1: A convex function $g(x)$ satisfies (i), (ii), (iii) below iff $g(x) \equiv MCF(x)$.

(i) $g(x) = f(x)$, $\forall x \in X$

(ii) $g(x) \le f(x)$

(iii) (maximality): $g(x) \le f'(x)$ for any convex function $f'$ satisfying (i), (ii).

PROOF: From Theorem 3 we know that $MCF$ satisfies the above conditions (i)-(iii); in particular $MCF(x) \le g(x)$; furthermore, $MCF$ is convex, hence from (iii) one must have $g(x) \le MCF(x)$. Q.E.D.

Thus, one has: The maximal convex extension of the function $f$ with respect to $X$ is unique.

3. QUASI-CONVEX EXTENSIONS 

In many ways, and particularly in the context of the optimality condition described in Section 0, convex extensions appear to be too restrictive; indeed only quasi-convexity is really required, and we now turn to the definition of the corresponding generalizations.

DEFINITION 4: $f$ is a quasi-convex function if and only if its level sets $\operatorname{lev}_\varphi f = \{x \mid f(x) \le \varphi\}$ are convex.

DEFINITION 5: A quasi-convex extension $QCF$ of the function $f$ with respect to the (convex) set $X$ is a function whose level set $L_\varphi$ is a convex extension of $\operatorname{lev}_\varphi f$ with respect to $X$; by definition one has

$QCF(x) = \inf \{\varphi, +\infty \mid x \in L_\varphi\}$.

It may happen here that $\operatorname{lev}_\varphi f$ is not closed (when $f$ is discontinuous); one then uses $\operatorname{cl} \operatorname{lev}_\varphi f$ to define the extension.

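LEMMA 6: $\forall x \in \operatorname{Int} X$, one has $QCF(x) = f(x)$.

PROOF: By Definition 1, one has

$\operatorname{Int} X \cap \operatorname{lev}_\varphi f = \operatorname{Int} X \cap L_\varphi$, $\forall \varphi \in f(X)$;

thus, for $x \in \operatorname{Int} X$ one has $f(x) = \varphi \Leftrightarrow QCF(x) = \varphi$. Q.E.D.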



CONVEX AND POLAROID EXTENSIONS 73 

LEMMA 7: The quasi-convex extension of a quasi-convex function on X is a continuous 
quasi-convex function. 

PROOF: As for Lemma 5, this follows from convexity and closedness of the convex extension $L_\varphi$. Q.E.D.

Here again $f$ is not required to be continuous and one may have $f(x) \neq QCF(x)$ for $x \in \operatorname{bd} X$ (see the remark after Lemma 5). If $L_\varphi$ is the maximal extension $\forall \varphi$, one obtains the maximal quasi-convex extension $MQCF$ with the following property:

THEOREM 4: For any real-valued quasi-convex functions $f$, $f'$ with $f'(x) = f(x)$, $\forall x \in X$, one has $f'(x) \ge MQCF(x)$, $\forall x \in \mathbb{R}^n$.

PROOF: By construction $\operatorname{lev}_\varphi MQCF$ is maximal $\forall \varphi$; hence,

$\operatorname{lev}_\varphi f' \subset \operatorname{lev}_\varphi MQCF$ must hold $\forall \varphi$.

Q.E.D.

Clearly, since a convex extension is also quasi-convex, one may consider both extensions of a convex function; one then has the following uniform dominance theorem.

COROLLARY T4.1: $\operatorname{lev}_\varphi f \subset \operatorname{lev}_\varphi MCF \subset \operatorname{lev}_\varphi MQCF$, $\forall \varphi$;
$f(x) \ge MCF(x) \ge MQCF(x)$, and equality holds throughout on $\operatorname{Int} X$.

An application of this main result is presented in the next section.

In conclusion of this first part of the paper, we observe that quasi-convex extensions provide the proper theoretical foundations (rather than convex extensions) for the investigation of optimality conditions within the framework of convex analysis. It should also be pointed out that, in spite of its trivial derivation, the above dominance theorem is of great practical importance; the reader may also convince himself that the quasi-convex extension $MQCF$ has a graph different from $MCF$.

4. (QUASI) CONCAVE PROGRAMMING 

Let us consider our initial non-linear program (see Section 0): maximize $f(x)$ subject to $x \in X \subset \mathbb{R}^n$, where $X$ is assumed (closed) convex and $f$ (quasi-)convex. Thus, the inclusion test $\operatorname{lev}_k f \supset X$ required by the sufficiency conditions deals with two convex sets and we can use the Uniform Dominance Theorem T4.1 to obtain better optimality conditions than the above. We use the word "better" to indicate that, whatever method is used to actually perform an inclusion test, it is reasonably clear that such a test is more conveniently performed on $Q_1 \supset X$ than on $Q_2 \supset X$ when $Q_1 \supset Q_2$. The following theorem shows that sufficiency is indeed maintained.

THEOREM 5: Assume continuity of $f$ on $X$; then one has

$X \subset \operatorname{lev}_k f \Leftrightarrow X \subset \operatorname{lev}_k MCF \Leftrightarrow X \subset \operatorname{lev}_k MQCF$

and

$f(x) = MCF(x) = MQCF(x)$, $\forall x \in X$.

PROOF: Immediate from Lemmas 5 and 7.

REMARK 1: The possibility of improving the sufficiency inclusion test by considering a convex extension was first pointed out by Hoang Tuy in [5]; the equivalence of his description of a convex extension with ours is shown in Theorem 3. The introduction of a quasi-convex extension serves the double purpose of providing yet a better condition when $f$ is convex and also a possibility to handle the quasi-convex case in a similar manner.




REMARK 2: Since the condition $\operatorname{lev}_k f \supset X$ is necessary and sufficient for optimality of $\bar{x} \in X$ (with $f(\bar{x}) = k$), Theorem 5 describes a variety of necessary and sufficient optimality conditions; it should be noted, however, that while extensions have the property of improving (weakening) the sufficiency conditions, this has an adverse effect on the necessary aspect since one wants the strongest possible necessary condition. On the other hand, our assumptions on $X$ and $f$ render this "necessary" side of the question uninteresting because it has been thoroughly developed in the literature.



5. POLAROID EXTENSIONS 

The definitions of (quasi-)convex extensions do not lend themselves easily to a straightforward algorithmic implementation because they are geometric rather than analytical. One may also encounter practical difficulties when $f$ is not continuous on $X$. We now introduce the concept of polaroid extensions to remedy this situation.

DEFINITION 6 [2]: Given $k \in \mathbb{R}$, the polaroid set $X^*(k)$ corresponding to the set $X$ with respect to the function $\varphi: \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ is defined by

$X^*(k) = \{y \mid \varphi(x; y) \le k, \forall x \in X\} \subset \mathbb{R}^n$.

DEFINITION 7: The function $\varphi$ is said to polarize $f$ on $X$ if $\forall x \in X$ one has $f(x) = \varphi(x; x)$. The polarization is said to be proper when $\varphi$ satisfies $\varphi(x; y) \le \max \{f(x), f(y)\}$.

THEOREM 6: If, $\forall x \in X$, the function $\psi(y) = \varphi(x; y)$ is quasi-convex (in $y$), then the polaroid $X^*(k)$ is convex $\forall k$.

PROOF: Omitted (see [2, 3]). 

In the sequel, we shall always assume that $\varphi$ satisfies the hypothesis of Theorem 6, as we only make use of convex polaroids.

We can now apply the above concepts to our present objective of constructing a (quasi-)convex extension of $f$; to this effect we now consider the set $(X \cap \operatorname{lev}_k f)$ and its polaroid $P(k)$:

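LEMMA 8: Under proper polarization, the polaroid $P(k) = (X \cap \operatorname{lev}_k f)^*(k)$ is a convex extension of the set $\operatorname{lev}_k f$ with respect to $X$.

PROOF: a) Convexity follows from Theorem 6.

b) $P(k) \supset \operatorname{lev}_k f$: one has $P(k) = \{y \mid \varphi(x; y) \le k, \forall x \in (X \cap \operatorname{lev}_k f)\}$; take $y \in \operatorname{lev}_k f$, $\forall x \in (X \cap \operatorname{lev}_k f)$; then $\max \{f(x), f(y)\} \le k$ holds true. Hence, since the polarization $\varphi$ is proper,

$\varphi(x; y) \le \max \{f(x), f(y)\} \le k$, i.e., $y \in P(k)$.

c) Take $z \in X \cap \operatorname{lev}_k f$, with $f(z) = k$; the set $Q = \{y \mid \varphi(z; y) \le k\}$ is convex since $\varphi(z; y)$ is assumed quasi-convex in $y$; furthermore, the following inclusions hold: $Q \supset P(k) \supset \operatorname{lev}_k f \supset (X \cap \operatorname{lev}_k f)$; moreover, $f(z) = \varphi(z; z) = k$ implies $z \in \operatorname{bd} Q$ and $z \in \operatorname{bd} \operatorname{lev}_k f$. Since $Q \supset \operatorname{lev}_k f$ and both sets are convex, there must exist a hyperplane through $z$ supporting both $Q$ and $\operatorname{lev}_k f$; furthermore one also has $z \in \operatorname{bd} P(k)$. Thus, since $P(k) \subset Q$, the set $P(k)$ has a common boundary with the set $\operatorname{lev}_k f$ within the set $X$. Convexity implies $P(k) \cap X = \operatorname{lev}_k f \cap X$. Q.E.D.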
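As a numerical illustration of Lemma 8, the following sketch tests membership in $P(k)$ on a sampled one-dimensional set. The choices $f(x) = x^2$, $X = [-1, 1]$, the grid, the tolerance, and the tangent polarization $\varphi(x; y) = f(x) + (y - x) f'(x)$ (the one studied in Section 6) are assumptions made only for this example; the function names are hypothetical.

```python
def f(x):
    # Illustrative convex function (assumption for this example).
    return x * x

def df(x):
    # Derivative of the illustrative f.
    return 2.0 * x

def phi(x, y):
    # Tangent polarization: phi(x; y) = f(x) + (y - x) f'(x); note phi(x; x) = f(x).
    return f(x) + (y - x) * df(x)

# X = [-1, 1], represented by a finite grid (approximation).
X = [-1.0 + i / 100.0 for i in range(201)]

def in_polaroid(y, k, tol=1e-12):
    """True if y lies in P(k) = (X ∩ lev_k f)*(k), tested on the sampled X."""
    base = [x for x in X if f(x) <= k + tol]
    return all(phi(x, y) <= k + tol for x in base)

k = 4.0
# A point outside lev_k f that still belongs to the polaroid P(k):
print(in_polaroid(2.4, k), f(2.4) <= k)
# A point outside the polaroid:
print(in_polaroid(2.6, k))
# On X itself, P(k) and lev_k f agree:
print(all(in_polaroid(x, k) == (f(x) <= k) for x in X))
```

In this instance $P(4) = [-2.5, 2.5]$ strictly contains $\operatorname{lev}_4 f = [-2, 2]$ while the two sets coincide on $X$, which is the behaviour the lemma describes.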

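COROLLARY L8.1: Consider the maximal extension $L_k$ of $\operatorname{lev}_k f$ with respect to $X$; then one has $\operatorname{lev}_k f \subset P(k) \subset L_k$ and $(P(k) \cap \operatorname{Int} X) = (L_k \cap \operatorname{Int} X) = (\operatorname{lev}_k f \cap \operatorname{Int} X)$.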




DEFINITION 8: A polaroid extension $PF$ of the function $f$ with respect to the set $X$ is a function whose level sets are

$\operatorname{lev}_k PF = P(k)$,

i.e.,

$PF(y) = \inf \{+\infty, k \mid y \in P(k)\}$.

Note that $PF$ depends (implicitly) on the polarization $\varphi$ of $f$ and that there are, in general, several polaroid extensions of the same $f$. One may also observe that the definition of $PF$ does not require $f$ to be convex; $PF$ is always quasi-convex, however. Furthermore one has the following alternate definition of $PF(y)$ for $y \in \operatorname{dom} PF$ (i.e., $PF(y) < +\infty$):

$PF(y) = \inf \{k \mid \varphi(x; y) \le k, \forall x \in X \cap \operatorname{lev}_k f\} = \sup \{k \mid \varphi(x; y) \ge k, \text{ some } x \in X\}$;

hence

$PF(y) = \max_{x \in X} \varphi(x; y)$

and the above expression can be used to compute any value of $PF$.
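The max expression above can be evaluated directly when $X$ is replaced by a finite sample. The sketch below is only an approximation under stated assumptions: the helper name `polaroid_extension`, the grid for $X = [-1, 1]$, the function $f(x) = x^2$, and the tangent polarization are illustrative choices, not part of the original text.

```python
def polaroid_extension(phi, X_sample, y):
    """Approximate PF(y) = max over x in X of phi(x; y) on a finite sample of X."""
    return max(phi(x, y) for x in X_sample)

def f(x):
    # Illustrative convex function (assumption).
    return x * x

def phi(x, y):
    # Tangent polarization f(x) + (y - x) f'(x) for f(x) = x^2 (assumption).
    return f(x) + (y - x) * 2.0 * x

# Sampled X = [-1, 1].
X_sample = [-1.0 + i / 100.0 for i in range(201)]

for y in (0.5, 2.0, -3.0):
    # Inside X the value agrees with f; outside X it is the tangent continuation.
    print(y, polaroid_extension(phi, X_sample, y), f(y))
```

For this choice the computed values agree with $f$ at points of $X$ and fall below $f$ outside $X$, where $PF$ continues along the tangent taken at the nearest end point of the interval.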

COROLLARY L8.2: $f(x) \ge PF(x) \ge MQCF(x)$, $\forall x$, and equality holds throughout $\forall x \in \operatorname{Int} X$.

PROOF: Follows from the inclusions of Corollary L8.1.

There exists no general dominance theorem between $MCF$ and $PF$; however, one obviously has

COROLLARY L8.3: If the polaroid extension $PF$ is convex then

$PF(x) \ge MCF(x)$ (and $PF(x) = MCF(x) = f(x)$, $\forall x \in \operatorname{Int} X$).

PROOF: By maximality of $MCF$.

In conclusion, we note that polaroid extensions may constitute a viable alternative to the definitions of Sections 1 and 2, particularly when the non-linear programs

maximize $\varphi(x; y)$, subject to $x \in X$

are conveniently solved for each parameter $y$. Since it is necessary to produce the global optimum of the above program, one will naturally restrict the choice of $\varphi$ to functions of the following type: $\forall y$, $\varphi(x; y)$ pseudo-concave (in $x$); remember that we also assume, $\forall x$, $\varphi(x; y)$ quasi-convex (in $y$), in order to obtain convex polaroids.

LEMMA 9: Let $\varphi$ satisfy the above conditions, and $X$ be a convex set; then the program

$\max_{x \in X} \varphi(x; y)$

is convex $\forall y$, i.e., every local optimum is a global optimum.

PROOF: Omitted.

It seems natural, at this point, to question whether the abstract definition of a polaroid extension is really useful, and in particular whether there are cases where a concrete polarization $\varphi$ can be explicitly given; the following sections will show that indeed one can easily find polarizations of a given function $f$ which are both strong (in the sense of the Dominance Theorem T4.1) and implementable.




6. DIFFERENTIABLE f

In this section we present some results related to Tuy's note [5]. First we give, by means of a polaroid extension, an analytical characterization of Tuy's convex extension; a family of polaroid extensions is then constructed which satisfies a dominance property (Theorem 8). The practical significance of this theory is briefly illustrated with a quadratic example, while a more comprehensive description of this special case can be found in [1] for the concave and [4] for the general case.

Define the polarization

$\varphi(x; y) = f(x) + (y - x) \nabla f(x)$.

Clearly $\varphi(x; x) = f(x)$ and, since $\varphi$ is linear in $y$, Theorem 6 holds; moreover,

LEMMA 10: If $f$ is quasi-convex, $\varphi$ is proper.

PROOF: Quasi-convexity of $f$ implies, in the case of differentiability:

$\max \{f(x), f(y)\} \ge f(x) + (y - x) \nabla f(x)$;

hence $\varphi(x; y) \le \max \{f(x), f(y)\}$. Q.E.D.

We now restrict ourselves to a convex function $f$ and show that the corresponding polaroid extension $PF$ is the maximal convex extension $MCF$ (Lemmas 11 and 12).

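LEMMA 11: $PF$ is convex and $PF(y) \le f(y)$ when $f$ is pseudo-convex.

PROOF: We only need consider $y \in \operatorname{dom} PF$; take

$y = \lambda y^1 + (1 - \lambda) y^2$.

Then, with $\bar{x}$ denoting a point of $X$ at which the maximum is attained,

$PF(y) = \max_{x \in X} \varphi(x; y) = \max_{x \in X} [\lambda \varphi(x; y^1) + (1 - \lambda) \varphi(x; y^2)] = f(\bar{x}) + [\lambda y^1 + (1 - \lambda) y^2 - \bar{x}] \nabla f(\bar{x})$

$= \lambda [f(\bar{x}) + (y^1 - \bar{x}) \nabla f(\bar{x})] + (1 - \lambda) [f(\bar{x}) + (y^2 - \bar{x}) \nabla f(\bar{x})] \le \lambda \max_{x \in X} \varphi(x; y^1) + (1 - \lambda) \max_{x \in X} \varphi(x; y^2)$

$= \lambda PF(y^1) + (1 - \lambda) PF(y^2)$.

Also $PF(y) = \max_{x \in X} \varphi(x; y) = f(\bar{x}) + (y - \bar{x}) \nabla f(\bar{x}) \le f(y)$ by pseudo-convexity of $f$.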

LEMMA 12: $PF$ is maximal.

PROOF: We must show that for any convex function $g(x)$ with $g(x) = PF(x) = f(x)$, $\forall x \in X$, one has

$g(y) \ge PF(y)$, $\forall y \in \mathbb{R}^n$.

Since $g$ and $f$ agree identically on $X$, $g$ is subdifferentiable; furthermore one has

$\nabla f(x) \in \partial g(x)$, $\forall x \in X$,

where $\partial g(x)$ is the subdifferential of $g$ at $x$. Now the subgradient inequality gives

$g(y) \ge g(x) + (y - x) u$, $\forall u \in \partial g(x)$, $\forall y \in \mathbb{R}^n$;

in particular for $u = \nabla f(x)$.

Now given $y \in \mathbb{R}^n$, choose $\bar{x} \in X$ such that

$PF(y) = \max_{x \in X} \varphi(x; y) = f(\bar{x}) + (y - \bar{x}) \nabla f(\bar{x})$;

one has

$g(y) \ge g(\bar{x}) + (y - \bar{x}) \nabla f(\bar{x}) = f(\bar{x}) + (y - \bar{x}) \nabla f(\bar{x}) = PF(y)$. Q.E.D.

It is interesting to note the following additional property of $PF$: Take $y \in \mathbb{R}^n - X$; assume $x' \in X$ is such that

$\varphi(x'; y) = f(x') + (y - x') \nabla f(x') = \max_{x \in X} \varphi(x; y) = PF(y)$.

Then

$PF(z) = f(x') + \lambda (y - x') \nabla f(x')$, $\forall z = \lambda y + (1 - \lambda) x'$, $\lambda \in [0, 1]$,

because $PF$ is convex and satisfies by construction

$PF(z) \ge \varphi(x'; z)$ (since $x' \in X$)

with equality holding at $z = x'$ and $z = y$; i.e., the convex extension $PF$ is linear on the segment $[x', y]$.

COROLLARY L12.1 : PF=MCF . 

Note that convexity of $f$ is not essential here; however, if $f$ were not convex one could have $PF(x) \neq f(x)$ for some $x \in X$, since $PF$ is convex. It should also be noted that $PF$ is only maximal with respect to other convex extensions; if quasi-convex extensions are taken into consideration, maximality is, in general, lost, as shown by Theorem 8 below.

The polaroid extension $PF$ can be extended to the case where the function $f$ is only required to be convex and continuous on $X$ (i.e., only subdifferentiable instead of differentiable); this merely involves replacing the gradient by a subdifferential.

It is interesting to note that although $PF$ is the maximal convex extension of a convex (differentiable) function $f$, it is really not the best extension which can be obtained by this type of polarization; to illustrate this point, we now imbed $PF$ in a family of polaroid extensions $PF_\alpha$ (with $PF_1 = PF$)

$\varphi_\alpha(x; y) = f(x) + \alpha (y - x) \nabla f(x)$, $\alpha > 0$,

and prove the following dominance theorem.

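THEOREM 8: $PF_{\alpha_1}(x) \ge PF_{\alpha_2}(x)$ if $\alpha_1 \ge \alpha_2$.

PROOF: We show that the corresponding polaroids satisfy $P_{\alpha_1}(k) \subset P_{\alpha_2}(k)$. Take $y_1 \in \operatorname{bd} P_{\alpha_1}(k)$, and choose $x_1 \in X \cap \operatorname{lev}_k f$ such that

$\varphi_{\alpha_1}(x_1; y_1) = k$.

Suppose $y_1 \notin P_{\alpha_2}(k)$; then there must exist a point $x_2 \in X \cap \operatorname{lev}_k f$ such that

$\varphi_{\alpha_2}(x_2; y_1) = k' > k$.

It follows that $\alpha_2 (y_1 - x_2) \nabla f(x_2) = k' - f(x_2) > 0$ (since $f(x_2) \le k$) and hence

$(y_1 - x_2) \nabla f(x_2) > 0$.

Furthermore, $y_1 \in P_{\alpha_1}(k)$ implies

$f(x_2) + \alpha_1 (y_1 - x_2) \nabla f(x_2) = f(x_2) + \alpha_2 (y_1 - x_2) \nabla f(x_2) + (\alpha_1 - \alpha_2)(y_1 - x_2) \nabla f(x_2) \le k$;

hence

$(\alpha_1 - \alpha_2)(y_1 - x_2) \nabla f(x_2) \le k - k' < 0$,

which implies $(\alpha_1 - \alpha_2) < 0$, a contradiction to the hypothesis $\alpha_1 \ge \alpha_2$; thus, $y_1 \in P_{\alpha_2}(k)$. Q.E.D.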
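The inclusion of polaroids used in the proof can be checked numerically on a small instance. The sketch below is an illustration under assumptions not found in the text: $f(x) = x^2$, $X = [-1, 1]$ sampled on a grid, the level $k = 4$, and the hypothetical helper names `phi_alpha` and `in_P`.

```python
def f(x):
    # Illustrative convex function (assumption).
    return x * x

def grad_f(x):
    return 2.0 * x

def phi_alpha(alpha, x, y):
    # Family of polarizations: f(x) + alpha (y - x) grad f(x).
    return f(x) + alpha * (y - x) * grad_f(x)

# Sampled X = [-1, 1].
X_sample = [-1.0 + i / 100.0 for i in range(201)]

def in_P(alpha, y, k, tol=1e-12):
    """Membership of y in P_alpha(k) = (X ∩ lev_k f)*(k), tested on the sample."""
    base = [x for x in X_sample if f(x) <= k + tol]
    return all(phi_alpha(alpha, x, y) <= k + tol for x in base)

k = 4.0
ys = [-5.0 + i / 20.0 for i in range(201)]  # test points y in [-5, 5]
P = {a: [y for y in ys if in_P(a, y, k)] for a in (0.5, 1.0, 2.0)}

# Larger alpha gives a smaller polaroid, hence a larger extension value.
nested = set(P[2.0]) <= set(P[1.0]) <= set(P[0.5])
print(nested, max(P[2.0]), max(P[1.0]), max(P[0.5]))
```

On this instance the sampled polaroids are nested in the order the theorem predicts; the check is of course no substitute for the proof.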

An Illustration : The Quadratic Case [1] 

Consider

$f(x) = cx + \frac{1}{2} xCx$,

where $f$ is convex, i.e., $C$ positive semi-definite. The maximal convex extension of $f$ with respect to a feasible set $X$ is defined by the polarization (see L12.1)

$\varphi(x; y) = cx + \frac{1}{2} xCx + cy + yCx - cx - xCx = -f(x) + [cx + cy + yCx]$.

COROLLARY L12.2: The maximal convex extension of a quadratic (convex) function can be obtained by solving convex programs of the type

$\max_{x \in X} \varphi(x; y)$,

where $y$ is given and $\varphi(x; y)$ is a concave function (of $x$).

PROOF: Since $f$ is convex, $-f$ is concave and $\varphi = -f +$ linear terms is also concave; hence, the program is convex. Q.E.D.

But the family of polarizations $\varphi_\alpha(x; y) = f(x) + \alpha (y - x) \nabla f(x)$, $(\alpha > 0)$, obeys the dominance relation of Theorem 8. Moreover, for the quadratic (convex) case, observe that

$\varphi_{1/2}(x; y) = \frac{1}{2} [cx + cy + yCx]$

is bilinear. Thus, in this case the convex programs of Corollary L12.2 are linear programs. Since one has

$\varphi_\alpha(x; y) = -(2\alpha - 1) f(x) + \alpha [cx + cy + yCx]$,

it is apparent that for any given $y$ the function $\varphi_\alpha(x; y)$ is convex in $x$ when $0 < \alpha < 1/2$; thus, a polaroid extension of this type "better" than $PF_{1/2}$ (in the sense of Theorem 8) can only be obtained at the cost of solving a (non-convex) concave quadratic program in Corollary L12.2; this is of course of little practical value since such programs are often of the same degree of difficulty as the original concave program: $\max f(x)$, subject to $x \in X$. However, in certain cases, where the polyhedral set used to define the polaroid has a simple structure (for instance, the simplicial set $S$ of Section 7), this can be efficient (see [4]).
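The closed form of $\varphi_\alpha$ in the quadratic case can be checked against the defining expression $f(x) + \alpha (y - x) \nabla f(x)$. The sketch below uses illustrative data: the vector `c`, the symmetric positive semi-definite matrix `C`, and the test points are assumptions chosen only for the check.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(C, x):
    return [dot(row, x) for row in C]

# Illustrative data (assumptions): C is symmetric and positive definite.
c = [1.0, -2.0]
C = [[2.0, 0.5], [0.5, 1.0]]

def f(x):
    # f(x) = cx + (1/2) xCx
    return dot(c, x) + 0.5 * dot(x, matvec(C, x))

def grad_f(x):
    # Gradient c + Cx (valid because C is symmetric).
    return [ci + gi for ci, gi in zip(c, matvec(C, x))]

def phi_alpha(alpha, x, y):
    # Definition: f(x) + alpha (y - x) grad f(x)
    diff = [yi - xi for xi, yi in zip(x, y)]
    return f(x) + alpha * dot(diff, grad_f(x))

def phi_alpha_closed(alpha, x, y):
    # Closed form: -(2 alpha - 1) f(x) + alpha [cx + cy + yCx]
    bilinear = dot(c, x) + dot(c, y) + dot(y, matvec(C, x))
    return -(2.0 * alpha - 1.0) * f(x) + alpha * bilinear

x, y = [0.3, -1.2], [2.0, 0.7]
for alpha in (0.25, 0.5, 1.0):
    print(alpha, phi_alpha(alpha, x, y), phi_alpha_closed(alpha, x, y))
```

At $\alpha = 1/2$ the term in $f(x)$ drops out of the closed form, which is why the polarization is bilinear in that case.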

7. POLAROIDS AND CUTTING PLANES 

As mentioned in Section 6, one of the applications of polaroid extensions is (quasi-) concave 
programming. We now indicate how cutting planes can be constructed from a given extension. 
In the second part of this section we present a new (interactive) construction of the "deepest" cut 
to be generated in the present framework. 




7.1. Cutting Planes 

The inclusion test (see Section 0) of two convex sets can often be accomplished efficiently by 
a cutting plane algorithm, particularly when the feasible set X is polyhedral. (There exist other 
types of procedures, which are also based on convex extensions and can be used in this context 
(see, for instance [1])). 

Take the current best value $k = f(\bar{x})$ where $\bar{x} \in X$ is the corresponding solution; by hypothesis one has $\bar{x} \in \operatorname{lev}_k f$ and a cutting plane is constructed to discard a (cut-off) portion $S \cap X$ which cannot contain a point $y$ better than $\bar{x}$, i.e., $S \cap X \subset \operatorname{Int}(\operatorname{lev}_k f)$, or $f(y) < k$, $\forall y \in S \cap X$, where the set $S$ is used to denote a cut (see [2]). (Here $f$ should be assumed strictly quasi-convex and lower semi-continuous.) Cutting planes are a convenient way to define such a cut-set $S$, which is then a half-space; thus, in essence cuts are used to reduce the feasible set $X$ and thereby facilitate the inclusion test.

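When $X$ is polyhedral, one can define a simplicial cut-set $S$ with $\bar{x}$ a vertex of $X$. Let $\bar{x}$ be obtained from an L.P. tableau of the form

$x = \bar{x} - \sum_{j \in N} a_j t_j$, $t_j \ge 0$,

where $N$ is the non-basic index set corresponding to $\bar{x}$. The cut-simplex $S$ can be determined as the convex hull of the $(n + 1)$ points (see [3, 5]) $\{\bar{x}, u_j, j \in N\}$ where $u_j$ is the intersection point of the ray $u_j(t_j) = \bar{x} - t_j a_j$, $t_j \ge 0$, with the boundary of the set $\operatorname{lev}_k f$ (or (better) $\operatorname{lev}_k MCF$, $P(k)$, or $\operatorname{lev}_k MQCF$), i.e., $u_j = u_j(\bar{t}_j)$, $\bar{t}_j > 0$. Then one has

$S = \{x = x(t) \mid \sum_{j \in N} t_j / \bar{t}_j \le 1,\ t_j \ge 0,\ \forall j \in N\}$.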
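The ray intersections that define the cut-simplex can be sketched numerically as follows. This is only an illustration under assumptions: the convex function `f`, the vertex `x_bar`, and the edge directions (playing the role of $-a_j$) are made up for the example, and the helper `ray_step` uses doubling followed by bisection, which is one simple way to locate the boundary of a level set along a ray.

```python
import math

def f(x):
    # Illustrative convex objective (assumption, not from the paper).
    return (x[0] - 1.0) ** 2 + (x[1] - 1.0) ** 2

def ray_step(f, x_bar, direction, k, t_cap=1e6, iters=80):
    """Largest t with f(x_bar + t * direction) <= k (bisection sketch).

    Relies on f(x_bar) <= k and on the level set meeting the ray in an interval,
    which holds for quasi-convex f. Returns math.inf if no exit is found below t_cap.
    """
    def point(t):
        return [xi + t * di for xi, di in zip(x_bar, direction)]

    hi = 1.0
    while f(point(hi)) <= k:          # expand until the ray leaves the level set
        hi *= 2.0
        if hi > t_cap:
            return math.inf
    lo = 0.0
    for _ in range(iters):            # bisection on the exit point
        mid = 0.5 * (lo + hi)
        if f(point(mid)) <= k:
            lo = mid
        else:
            hi = mid
    return lo

x_bar = [0.0, 0.0]                    # current vertex (assumption)
k = f(x_bar)                          # current best value
directions = [[1.0, 0.0], [0.0, 1.0]] # edge directions, i.e. -a_j (assumption)
t_bar = [ray_step(f, x_bar, d, k) for d in directions]

def in_cut_simplex(t):
    """True if the point with edge coordinates t lies in the cut-simplex S."""
    return all(tj >= 0.0 for tj in t) and sum(tj / tb for tj, tb in zip(t, t_bar)) <= 1.0

print(t_bar)
print(in_cut_simplex([0.5, 0.5]), in_cut_simplex([1.5, 1.0]))
```

Replacing `f` by an extension with larger level sets can only increase the step lengths `t_bar`, which is how a deeper cut arises.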



COROLLARY T4.2: The cut-simplices generated by intersection with convex extensions satisfy the following uniform dominance relation: $S_f \subset S_{MCF} \subset S_{MQCF}$.

7.2. Polaroids 

The main purpose of a polaroid extension in the construction of a cut is to insure depth and validity, and the following immediate observations come to mind:

a) Validity is defined by $x \in S \cap X \Rightarrow x \in \operatorname{lev}_k f$, i.e., the points of $X$ which are actually cut off must lie in $\operatorname{lev}_k f$: $(S \cap X) \subset \operatorname{lev}_k f$.

b) Polaroids $P^*$ of a given set $P$ can be viewed here as a tool to generate cut-sets $S \subset P^*$ because, under proper assumptions on $f$, one has $P^* \cap P = P \cap \operatorname{lev}_k f$ (Lemma 8 and corollaries).

c) Finally, polaroids enjoy the following inclusion reversing property (see [3]):

$Q \subset P \Rightarrow P^* \subset Q^*$.

Thus we may now combine a), b) and c) and define the following: 

1) Choose the set $Q = (X \cap S \cap \operatorname{lev}_k f)$ where $S$ is a cut-simplex (defined under 3) below).

2) Define the corresponding polaroid

$CP(k) = \{y \mid \varphi(x; y) \le k, \forall x \in Q\} = \{y \mid \varphi(x; y) \le k, \forall x \in (X \cap S \cap \operatorname{lev}_k f)\} \supset \{y \mid \varphi(x; y) \le k, \forall x \in (X \cap \operatorname{lev}_k f)\} = P(k) \supset \operatorname{lev}_k f$

(see Lemma 8).

3) $S$ is the cut-simplex obtained by intersection with the polaroid $CP$; note that $Q$ is defined by $S$ so that $S$ really plays the role of a variable parameter in this construction.
Now, by virtue of the inclusion reversing property c), one may define the following (dynamic) geometrical representation of the interaction between $S$ and the polaroid $CP$:

• Starting with a small $S$ (for instance $X \cap S = \{\bar{x}\}$), one obtains a very large $CP$ (since $\{\bar{x}\}^*(k)$ is a halfspace);

• Since $CP$ is "enormous", a larger cut-simplex $S' \supset S$ can be generated by ray intersection with $CP$.

• But now the corresponding polaroid $CP'$ is shrinking ($CP' \subset CP$) because of the inclusion reversing property c).

Thus, one easily imagines that at some point this mechanism will stop in some equilibrium position (usually not unique) where $S$ can no longer be made larger without violating the condition $S \cap X \subset CP$ required to guarantee validity.

The set $CP(k)$ is called cutting polaroid to emphasize the interaction between the cut and polaroid constructions: for the vertices of the cut-simplex $S$ one has $u_j \in \operatorname{bd} CP(k)$, $\forall j \in N$, that is:

$\max \varphi(u_j, x) = k$

subject to $x = \bar{x} - \sum_{j \in N} a_j t_j \in X$,

$\sum_{j \in N} t_j / \bar{t}_j \le 1$, $t_j \ge 0$, $\forall j$;

where the $\bar{t}_j$ are parameters.

A computationally simpler variant of this scheme is to start with a given hyperplane supporting $X$ at $\bar{x}$, say

$\sum_{j \in N} \alpha_j t_j = 0$,

and to "push" it into the feasible set as far as possible to create the cut

$\sum_{j \in N} \alpha_j t_j \ge \delta \ (> 0)$

and the corresponding cut-simplex $S$; the cutting polaroid approach is then used to determine the (largest) value $\delta > 0$ such that $\forall j \in N$:

$\max \varphi(u_j, x) \le k$
subject to $x \in X$,
(*) $\sum_{j \in N} \alpha_j t_j \le \delta$, $t_j \ge 0$, $\forall j \in N$



with 



<B 



In this case, there is only one parameter $\delta$; note that $\delta$ merely corresponds to the slack variable of the cut constraint (*); otherwise $\delta$ only appears in the objective function $\varphi$.

Thus, the cutting polaroid approach provides a means to determine a deep valid cut by parametric optimization (with parameter $\delta$). These programs have been studied in [4 and 8] in the quadratic case; this approach is more efficient than the (conventional) intersection approach proposed in [5], because this parametric program (linear in the quadratic case) is easily solved.

8. DISCRETE EXTENSIONS WITH RESPECT TO POLYHEDRAL SETS 

When $X$ is polyhedral, it is well known that the program

max $f(x)$, subject to $x \in X$
($f$ convex)

is either unbounded or possesses an optimal vertex; thus, the range of $f$ may be restricted to the finite set of values $f(\operatorname{vert} X)$ without essentially modifying the final outcome of the above optimization problem. A polaroid extension $PSF$ can thus be defined as a quasi-convex step function of the following type:

$\forall k \in f(\operatorname{vert} X)$, $\operatorname{lev}_k PSF = \{y \mid \varphi(x; y) \le k, \forall x \in \operatorname{vert} X \text{ with } f(x) \le k\}$;

for intermediate values $k \in (k_1, k_2)$ one has $\operatorname{lev}_k PSF = \operatorname{lev}_{k_1} PSF$, where $k_1$ and $k_2 \in f(\operatorname{vert} X)$, $k \notin f(\operatorname{vert} X)$.

In fact, we can modify the definition of $PSF$ further, by selecting in the definition of the polaroid only local optima of the initial non-linear problem, i.e., only the candidates for the global optimum, or (assuming differentiability):

$\operatorname{lev}_k PSF = \{y \mid \varphi(x; y) \le k$, for all those $x \in X \cap \operatorname{lev}_k f$ satisfying the necessary conditions $(\xi - x) \nabla f(x) \le 0$, $\forall \xi \in X\}$.

It is easily seen from the definition that $\operatorname{lev}_k PSF \supset P(k) = \operatorname{lev}_k PF$; i.e., one has the dominance relation $PSF(x) \le PF(x)$; note also that for $x \in X - \{\operatorname{vert} X\}$ one may have $PSF(x) < f(x)$ so that, strictly speaking, $PSF$ is no longer an extension as defined in Section 1. However, $PSF$ enjoys the same properties as any other polaroid extension, if one replaces $X$ by $\operatorname{vert} X$; and we therefore do not duplicate the results of the preceding sections.

REMARKS: 

a) The construction of $PSF$, in principle, requires one to solve (globally) an extreme point program of the type

$\max_{x \in Q_0} \varphi(x; y)$, $y$ given, $Q_0 = (\operatorname{vert} X \cap \operatorname{lev}_k f)$.

This may seem a tremendous task, but any set $Q$ satisfying $(X \cap \operatorname{lev}_k f) \supset Q \supset Q_0$ may be used to deliver a bound $k'$ which furnishes an approximation (from above) to $PSF$:

$\max_{x \in Q_0} \varphi \le \max_{x \in Q} \varphi = k'$.




The step function $PSF$ represents an improvement over the corresponding (i.e., with the same polarization $\varphi$) polaroid extension $PF$; and this happens in two ways which are best illustrated by an application to concave programming.

(i) Because the parameter $k$ can always be chosen to belong to $A = \{k = f(x) \mid x \in \operatorname{vert} X, x \text{ local optimum}\}$, the value of $k$ is increased step-wise by discrete amounts, until the value

$\bar{k} = \max f(x)$

is reached. In a practical sense, this can be realized during the construction of a cut-simplex $S$, for instance; one sets

$k_{\text{new}} = \max \{k_{\text{old}}, f(\bar{x}), f(x^j), \forall j \in N\}$

where $x^j$ are the neighboring (on $X$) vertices of $\bar{x}$, i.e.,

$x^j = \bar{x} - \bar{t}_j a_j$

where $\bar{t}_j$ is the largest value $t_j \ge 0$ such that $x^j \in X$.

(ii) One may replace $X$ by the set $\Gamma = \{x \in \operatorname{vert} X, x \text{ locally optimal}\}$ in the construction of a cutting polaroid (see Section 7) and obtain $CP(k) = (\Gamma \cap S)^*(k)$.

ACKNOWLEDGMENT 

I wish to express my thanks to R. Breu, who contributed to many improvements and corrections of the manuscript.

BIBLIOGRAPHY 

[1] Balas, E. and C.-A. Burdet, "Concave Quadratic Programming," Management Science Research Report #299, Carnegie-Mellon University (1972).
[2] Burdet, C.-A., "Polaroids: A New Tool in Non-Convex and in Integer Programming," Naval Research Logistics Quarterly 20, 13-24 (1973).
[3] Burdet, C.-A., "On Polaroid Intersections," Mathematical Programming in Theory and Practice, P. Hammer and G. Zoutendijk eds., pp. 365-387 (North Holland, 1974).
[4] Burdet, C.-A., "On Linearly Constrained Non-Convex Quadratic Programs," W.P. 91-72-3, Graduate School of Industrial Administration, Carnegie-Mellon University (1972).
[5] Tuy, Hoang, "Concave Programming Under Linear Constraints," (Russian), Doklady Akademii Nauk SSSR (1964). English translation in Soviet Mathematics, 1437-1440 (1964).
[6] Minkowski, H., "Theorie der konvexen Körper, insbesondere Begründung ihres Oberflächenbegriffs," Gesammelte Abhandlungen, 2 (Leipzig, 1911).
[7] Rockafellar, R. T., Convex Analysis (Princeton University Press, 1970).



A CUTTING PLANE ALGORITHM FOR THE BILINEAR PROGRAMMING 

PROBLEM 



H. Vaish 

California State University, Northridge 

Northridge, California 

C. M. Shetty 

Georgia Institute of Technology 

Atlanta, Georgia 



ABSTRACT 

In this paper we discuss the properties of a Bilinear Programming problem, and 
develop a convergent cutting plane algorithm. The cuts involve only a subset 
of the variables and preserve the special structure of the constraints involving the 
remaining variables. The cuts are deeper than other similar cuts. 



I. INTRODUCTION

The Bilinear Programming Problem considered in this paper can be stated as:

BLP: Minimize $\phi(x, y) = c'x + d'y + x'Cy$
(1) Subject to: $x \in X_0 = \{x \in \mathbb{R}^{m} \mid Ex \le e,\ x \ge 0\}$
$y \in Y_0 = \{y \in \mathbb{R}^{n} \mid Fy \le f,\ y \ge 0\}$

Without loss of generality, we will assume that $X_0$ and $Y_0$ are bounded. In spite of its special structure, problem BLP is a mathematical statement of a number of practical problems, for example, location-allocation problems, orthogonal production scheduling, multi-stage assignment problems, etc. (see [23] and [28]).

An important property of BLP to observe is that even though $\phi$ can be shown to be not quasi-concave, the optimal $(x^*, y^*)$ is attained at an extreme point of $X_0 \times Y_0$, i.e., $x^*$ will be an extreme point of $X_0$ and $y^*$ an extreme point of $Y_0$ [23, 29]. It seems reasonable that in solving BLP this extreme point optimality property should be taken advantage of. However, $\phi$ is not explicitly quasi-convex, so that local minima can and do exist [28]. This is the essential difficulty in solving BLP.

Problem BLP can be regarded as a Quadratic Programming problem in which the objective function need not be convex. A number of algorithms have been proposed for solving this class of problems. One group of these algorithms generates a sequence of expanding polytopes such that the minimum over each is known. This annexation procedure terminates when some polytope in the sequence contains the original polytope [13, 27, 29]. Some alternative approaches are discussed in [10, 24].




84 H. VAISH AND C. M. SHETTY 

The strategy we will use to solve the problem will be to develop a series of cuts such that no point with a lower value of $\phi(x, y)$ than the current best available is deleted. The approach is therefore the piecewise strategy discussed by Geoffrion [14] for solving large-scale problems. The process is repeated until all the feasible region has been explored.

II. CUTTING PLANE STRATEGIES 

Cutting planes have been used in integer programming for some time. Recent developments 
in [1, 2, 6, 15, 17, 19, 30] show how valid cuts can be generated using certain convex sets. On the 
other hand, several authors have used the cutting plane approach to solve nonconvex problems, 
e.g., see [3, 5, 11, 12, 13, 21, 26, 27]. The recent work of Burdet unifies these approaches through 
the use of Polaroids [6, 7, 9] and the related more general concept of level sets of convex gauge 
functions [8]. Our main concern here is to use Burdet's approach to solve problem BLP. However, 
as we will see, some modifications need to be done. 

Consider a problem P: Min $f(x)$, subject to $x \in S$, where $S$ is a polyhedral (compact) set and $f$ is nonconvex. A local star minimum of P is an extreme point $\bar{x}$ such that $f(\bar{x}) \le f(x)$ for each $x \in N(\bar{x})$, where $N(\bar{x})$ denotes the extreme points adjacent to $\bar{x}$. A local minimum of P is a point $\bar{x}$ such that $f(\bar{x}) \le f(x)$ for each $x \in B_\delta(\bar{x}) \cap S$, where $B_\delta(\bar{x})$ is a $\delta$-neighborhood around $\bar{x}$. If $f$ is quasi-concave, then a local star minimum is also a local minimum. In such a case, a cutting plane can be developed from a local star minimum $\bar{x}$ which cuts off $\bar{x}$ but not any other improving point, as was done in [5, 27]. On the other hand, if the assumption does not hold, a local minimum $\bar{x}$ is needed to define a cutting plane [3, 11, 26].
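The distinction between a local star minimum and a global minimum can be seen on a very small bilinear instance. The sketch below is illustrative only: the scalar data `c`, `d`, `C`, the unit intervals used for the two feasible sets, and the helper names are assumptions; adjacency is taken in the product sense (one of the two components moves to a neighbouring vertex).

```python
from itertools import product

# Illustrative scalar bilinear objective phi(x, y) = c x + d y + x C y (assumption).
c, d, C = 1.0, 1.0, -3.0

def phi(x, y):
    return c * x + d * y + x * C * y

# Extreme points of X0 = [0, 1] and Y0 = [0, 1] (assumption).
X_vertices = [0.0, 1.0]
Y_vertices = [0.0, 1.0]

def neighbours(v, vertices):
    # On an interval, the only adjacent extreme point is the other end point.
    return [w for w in vertices if w != v]

def is_local_star_min(x, y):
    """True if phi at (x, y) does not exceed phi at any adjacent extreme point."""
    adjacent = [(xn, y) for xn in neighbours(x, X_vertices)]
    adjacent += [(x, yn) for yn in neighbours(y, Y_vertices)]
    return all(phi(x, y) <= phi(xa, ya) for xa, ya in adjacent)

vertices = list(product(X_vertices, Y_vertices))
star_minima = [v for v in vertices if is_local_star_min(*v)]
best = min(vertices, key=lambda v: phi(*v))

print(star_minima)
print(best, phi(*best))
```

Here two vertices pass the adjacency test but only one of them is the global minimum, which is the situation a cutting plane generated at the inferior vertex has to cope with.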

There is yet another important property we would like to preserve. Cuts involving variables associated with both the $X_0$ and $Y_0$ sets will destroy their special structure. There are problems wherein one of the sets does have a special structure for which efficient algorithms exist that can be used to solve sub-problems in the solution procedure. In the location-allocation problem, the $Y_0$ set can be made to represent the transportation problem constraints.

Konno [23] has discussed this strategy in detail for the BLP. At a local star optimum $(\bar{x}, \bar{y})$ he defines a vector $g$ and finds the parameter value $\alpha$ such that the cut $g'x \ge \alpha$ deletes $\bar{x}$ but no point $x$ such that $\phi(x, y) < k - \epsilon$, where $k$ is the objective function value for the current best solution and $\epsilon > 0$ is a predetermined value. The procedure thus yields an $\epsilon$-optimal solution, i.e., a point $(x^*, y^*)$ such that $\phi(x^*, y^*) \le \phi(x, y) + \epsilon$ for all $x \in X_0$ and $y \in Y_0$.

Gallo and Ülkücü [13] have developed a cutting plane algorithm for BLP similar to the one discussed in this paper, where the cuts are applied in the $x$-set. Using duality theory they consider the following equivalent problem: Min $\psi(x, u) = \{c'x + \text{Max } f'u\}$ subject to $x \in X_0$ and $u \in U = \{u : F'u \le d + Cx,\ u \le 0\}$. For any $x^i \in X_0$, let $u^i$ be the point obtained by solving Max $f'u$ subject to $F'u \le d + Cx^i$, $u \le 0$. A cutting plane is generated from a vertex $x^i$ such that $\psi(x^i, u^i) > k$, where $k$ is the current best value of the objective function. Thus, if the vertex $x^i$ is in fact the current best, one has to move to a point yielding a poorer value of $\psi$. If no such point exists, special steps need to be taken. The algorithm is not guaranteed to converge.

Earlier we have indicated that cuts involving both the x and y variables can be generated from 
a local minimum. However, if we want to develop a cutting plane involving only the x-variables 
and yet be convergent, we need to develop the cut at (x, y) which is more than a local optimum. 






CUTTING PLANE ALGORITHM 85 

Such a point which is adequate is defined below. Throughout the rest of the paper by $X_0$ we mean the original feasible region $X_0$ or its subset obtained after the introduction of cuts.

DEFINITION: An extreme point $(\bar{x}, \bar{y})$ is called a pseudo-global minimum if $\phi(\bar{x}, \bar{y}) \le \phi(x, y)$ for each $x \in B_\delta(\bar{x}) \cap X_0$ and for each $y \in Y_0$.

Note that $(\bar{x}, \bar{y})$ is an extreme point of the constraints of BLP if and only if $\bar{x}$ is an extreme point of $X_0$ and $\bar{y}$ is an extreme point of $Y_0$. Further, an extreme point is adjacent to $(\bar{x}, \bar{y})$ if and only if it is of the form $(x^i, \bar{y})$ or $(\bar{x}, y^j)$ where $x^i \in N(\bar{x})$ and $y^j \in N(\bar{y})$. We will now characterize the various forms of "optimality" for BLP which will be used later in the development of the algorithms. Consider an extreme point $(\bar{x}, \bar{y})$. One can readily transform the origin to this point. Konno [23] has shown that such a transformation will yield a problem of the form given in (1) (with the additional property that $c \ge 0$, $f \ge 0$). The following theorems refer to the translated problem with

$(\bar{x}, \bar{y}) = (0, 0)$.

THEOREM 1: The origin (0, 0) is a local star minimum of BLP if and only if $x = 0$ solves the linear program P1: Min $\phi(x, 0)$, $x \in X_0$, and $y = 0$ solves the linear program P2: Min $\phi(0, y)$, $y \in Y_0$.

PROOF: Let $x = 0$ solve P1 and $y = 0$ solve P2. Then $\phi(0, 0) \le \phi(x, 0)$, $x \in X_0$. In particular, $\phi(0, 0) \le \phi(x^i, 0)$, $x^i \in N(0)$. Similarly, $\phi(0, 0) \le \phi(0, y^j)$, $y^j \in N(0)$. Hence,

$\phi(0, 0) \le \phi(x^i, y^j)$, $(x^i, y^j) \in N(0, 0)$.

Hence, (0, 0) is a local star minimum. Conversely, let (0, 0) be a local star minimum. Hence, $\phi(0, 0) \le \phi(0, y^j)$, $y^j \in N(0)$. Consider problem P2. Suppose we apply the simplex algorithm and obtain a basic feasible solution corresponding to $y = 0$. Since $\phi(0, 0) \le \phi(0, y^j)$ for each $y^j \in N(0)$, the simplex method will terminate and $y = 0$ is a solution to P2. Similarly, $x = 0$ solves P1.

THEOREM 2: The origin (0, 0) is a global minimum of BLP if $c'x \ge 0$, $d'y \ge 0$ and $x'Cy \ge 0$ for each $x \in X_0$, $y \in Y_0$.

PROOF: Trivial, since the hypothesis implies $\phi(x, y) \ge \phi(0, 0)$ for each $x \in X_0$ and $y \in Y_0$.

THEOREM 3: The origin (0, 0) is a local minimum of BLP if and only if for each $x \in X_0$ and $y \in Y_0$,

(i) $c'x \ge 0$ and $d'y \ge 0$, and

(ii) if $x'Cy < 0$ then $(c'x + d'y) > 0$.

PROOF: Suppose conditions (i) and (ii) hold. Then for $\epsilon > 0$ small enough, $\phi(\epsilon x, \epsilon y) = \epsilon (c'x + d'y + \epsilon x'Cy) \ge 0 = \phi(0, 0)$ since the term within parentheses is positive. Hence (0, 0) is a local minimum.

Now suppose (0, 0) is a local minimum. If $c'x < 0$, then $\phi(\epsilon x, 0) = \epsilon c'x < 0 = \phi(0, 0)$, contradicting the local optimality of (0, 0). Hence, $c'x \ge 0$, and likewise $d'y \ge 0$. Now if $x'Cy < 0$ and $(c'x + d'y) \le 0$, then $\phi(\epsilon x, \epsilon y) = \epsilon (c'x + d'y + \epsilon x'Cy) < 0 = \phi(0, 0)$ for all $\epsilon > 0$, again contradicting the local optimality of (0, 0). Hence, both conditions (i) and (ii) hold.

THEOREM 4: The origin (0, 0) is a local star minimum of BLP if and only if $c'x \ge 0$ and $d'y \ge 0$
for $x \in X_0$, $y \in Y_0$.

PROOF: Suppose (0, 0) is a local star minimum. Then from Theorem 1, $x = 0$ is a solution to
the problem:

$$\text{Min } \phi(x, 0) = c'x, \quad x \in X_0.$$

Hence, $c'x \ge 0$ for each $x \in X_0$. Likewise $d'y \ge 0$ for each $y \in Y_0$. Conversely, let $c'x \ge 0$ and $d'y \ge 0$ for each
$x \in X_0$ and $y \in Y_0$. But $\phi(x, 0) = c'x$ and $\phi(0, 0) = 0$. Hence $x = 0$ solves: Min $\phi(x, 0)$, $x \in X_0$. Likewise
$y = 0$ solves: Min $\phi(0, y)$, $y \in Y_0$. Then from Theorem 1 we have that (0, 0) is a local star minimum.

THEOREM 5: The origin (0, 0) is a pseudo-global minimum of BLP if and only if for each
$x \in X_0$ and $y \in Y_0$,

(i) $d'y \ge 0$, and

(ii) if $d'y = 0$ then $(c'x + x'Cy) \ge 0$.

PROOF: Suppose (0, 0) is a pseudo-global minimum. By definition it readily follows that it is
a local minimum. Hence, from Theorem 3 we have $d'y \ge 0$. Now for $x \in X_0$, $\alpha x \in B_\delta(0) \cap X_0$ for $\alpha > 0$
small enough. Then $\phi(\alpha x, y) = \alpha (c'x + x'Cy) + d'y \ge \phi(0, 0) = 0$ for each $y \in Y_0$ since (0, 0) is a
pseudo-global minimum. Hence, if $d'y = 0$ then $c'x + x'Cy \ge 0$.

Now assume that conditions (i) and (ii) hold. Consider the $m$ edges from $x = 0$ to the adjacent
extreme points. Let $0 \ne x^i \in X_0$ be on the $i$th edge. Let $y^1, \ldots, y^K$ denote the $K$ extreme points of
$Y_0$ and let

$$\alpha_{ik} = \begin{cases} -d'y^k / (c'x^i + x^{i\prime}Cy^k) & \text{if } (c'x^i + x^{i\prime}Cy^k) < 0 \\ \infty & \text{otherwise.} \end{cases}$$

Note that $\alpha_{ik} > 0$ by assumptions (i) and (ii). Let

$$\alpha_i = \min_{k = 1, \ldots, K} \alpha_{ik}.$$

If all $\alpha_{ik} = \infty$ we will let $\alpha_i$ be an arbitrarily large number.

Then $\phi(\alpha_i x^i, y^k) = \alpha_i (c'x^i + x^{i\prime}Cy^k) + d'y^k \ge 0$ from conditions (i) and (ii) and the definition of $\alpha_i$.
But the minimum of $\phi(x, y)$ for a fixed $x$ is achieved at an extreme point of $Y_0$. Hence,

$$\min_{y \in Y_0} \phi(\alpha_i x^i, y) \ge 0$$

for each $i = 1, \ldots, m$.

Now let $S = \text{Conv}[0, \alpha_i x^i \text{ for } i = 1, \ldots, m]$. Then Min $\phi(x, y)$, $x \in S$, $y \in Y_0$ is achieved at an
extreme point of $S$, which is 0 or of the form $\alpha_i x^i$. Hence,

$$\min_{x \in S,\ y \in Y_0} \phi(x, y) \ge \min \Big\{ \phi(0, 0),\ \min_{y \in Y_0} \phi(\alpha_i x^i, y) \Big\} \ge 0.$$

Now select a $\delta > 0$ such that $B_\delta(0) \cap X_0 \subset S$. Then Min $\phi(x, y) \ge 0$ for each $x \in B_\delta(0) \cap X_0$ and for each
$y \in Y_0$. That is, (0, 0) is a pseudo-global minimum.

It may be mentioned that the corresponding theorems in [23] can be obtained as special
cases of Theorems 2, 3, and 4. Further, from Theorems 3 and 4 we may observe that a local minimum
is always a local star minimum. The converse may not be true, as shown in [28]. Further, from the
definitions one can note that a pseudo-global minimum is always a local minimum.

III. GENERATION OF A PSEUDO-GLOBAL MINIMUM

Before discussing the procedure for getting a pseudo-global minimum, we will review the method 
due to Balas [1] for identifying precisely m edges incident on an extreme point x, which also leads 
to the resolution of degeneracy. 




Let $\bar{x}$ be an extreme point of $X_0$ and let $p_j$, $j \in J$, be the $m$ nonbasic variables at $\bar{x}$, where $J$ is
the index set for the nonbasic variables. Denoting by $\bar{e}^j$ the columns of the simplex tableau in
Tucker form (extended form), the $m$-vector $x$ can be written as:

$$x = \bar{x} - \sum_{j \in J} \bar{e}^j p_j$$

which satisfies the constraint $Ex \le e$ (but not necessarily the nonnegativity restriction) for any
$p_j \ge 0$. Let $N$ and $M$ denote the index sets associated with the constraints $Ex + u = e$ and $x \ge 0$
respectively, i.e.,

$$X_0 = \Big\{ x \in R^m : \sum_{j=1}^{m} e_{ij} x_j + u_i = e_i,\ i \in N, \text{ and } x_j \ge 0,\ j \in M \Big\}.$$

Given a basic feasible solution $(\bar{x}, \bar{u})$, let $N^0 = \{ i \in N : u_i \text{ is basic and } \bar{u}_i = 0 \}$ and
$M^0 = \{ j \in M : x_j \text{ is basic and } \bar{x}_j = 0 \}$. Let

$$X_0' = \Big\{ x \in R^m : \sum_{j=1}^{m} e_{ij} x_j + u_i = e_i,\ i \in N - N^0;\ x_j \ge 0,\ j \in M - M^0 \Big\}.$$

Then clearly, $X_0 \subseteq X_0'$ since $X_0'$ is obtained by deleting constraints of $X_0$.

THEOREM 6 [1]: Let $\bar{x}$ be an extreme point solution obtained by solving a linear program:
Min $\beta'x$, $x \in X_0$. Let $X_0'$ be defined as above. Then $\bar{x}$ is a vertex of $X_0'$ and $\beta'x = \beta'\bar{x}$ is a supporting
hyperplane for $X_0'$. Besides, $X_0'$ has precisely $m$ distinct edges incident on $\bar{x}$ and each half line

$$\xi^j = \{ x : x = \bar{x} - \bar{e}^j \lambda_j,\ \lambda_j \ge 0 \}, \quad j \in J$$

contains exactly one such edge.

We will now present two algorithms to generate a pseudo-global minimum. 

ALGORITHM A.

1. Find a feasible extreme point $x^1$ of $X_0$.

2. (a) Solve: Min $\phi(x^1, y)$, $y \in Y_0$, to yield an optimal $y^1$.
(b) Solve: Min $\phi(x, y^1)$, $x \in X_0$, to yield an optimum $x^2$.

Repeat until the procedure converges to a point $(\bar{x}, \bar{y})$, which clearly is a local star minimum.

3. Generate all alternative optimal extreme point solutions $\bar{y}^1, \ldots, \bar{y}^k$ ($k \ge 1$) to Min $\phi(\bar{x}, y)$,
$y \in Y_0$. Solve: Min $\phi(x, \bar{y}^i)$, $x \in X_0$ for $i = 1, \ldots, k$ to yield solutions $\bar{x}^1, \ldots, \bar{x}^k$. If $\phi(\bar{x}, \bar{y}) \le \phi(\bar{x}^i, \bar{y}^i)$
for all $i$, terminate; $(\bar{x}, \bar{y})$ is a pseudo-global minimum. If $\phi(\bar{x}^r, \bar{y}^r) < \phi(\bar{x}, \bar{y})$ for some $r$, go to step 2(a)
with $x^1 = \bar{x}^r$.

ALGORITHM B.

1 and 2. Find a local star minimum as in steps 1 and 2 of Algorithm A.

3. Let $x^1, \ldots, x^m$ be the adjacent extreme points of $\bar{x}$. Solve: Min $\phi(x^i, y)$, $y \in Y_0$, to yield
solutions $y^1, \ldots, y^m$. If $\phi(\bar{x}, \bar{y}) \le \phi(x^i, y^i)$ for all $i$, terminate; $(\bar{x}, \bar{y})$ is a pseudo-global minimum. If
$\phi(x^i, y^i) < \phi(\bar{x}, \bar{y})$, go to step 2(b) with $y^1 = y^i$.

It is intuitively evident that Algorithm B yields a pseudo-global minimum. However, to 
implement it we need a ready means of identifying the adjacent extreme points of x. Also note 
that both of the algorithms are finite. 
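
Steps 1 and 2, which are shared by both algorithms, can be sketched in a few lines of Python. This is only an illustration under strong assumptions: the feasible sets are supplied as explicit lists of extreme points (the hypothetical arguments `x_vertices` and `y_vertices`), so each linear program is solved by enumeration rather than by the simplex method, and the data are a made-up one-dimensional example.

```python
def phi(c, d, C, x, y):
    """Bilinear objective c'x + d'y + x'Cy on plain Python sequences."""
    linear = sum(ci * xi for ci, xi in zip(c, x)) + sum(dj * yj for dj, yj in zip(d, y))
    cross = sum(x[i] * C[i][j] * y[j] for i in range(len(x)) for j in range(len(y)))
    return linear + cross


def local_star_minimum(c, d, C, x_vertices, y_vertices, x_start):
    """Steps 1 and 2 of Algorithm A, with each LP solved by vertex enumeration.

    Assumption for illustration only: X0 and Y0 are given by their extreme points.
    """
    x = x_start
    y = min(y_vertices, key=lambda v: phi(c, d, C, x, v))            # step 2(a)
    while True:
        x_new = min(x_vertices, key=lambda v: phi(c, d, C, v, y))     # step 2(b)
        y_new = min(y_vertices, key=lambda v: phi(c, d, C, x_new, v)) # step 2(a) again
        if phi(c, d, C, x_new, y_new) >= phi(c, d, C, x, y):
            return x, y  # no strict improvement: x and y each solve their own LP
        x, y = x_new, y_new


# Hypothetical data: phi(x, y) = x + y - 3xy on the unit square.
c, d, C = [1], [1], [[-3]]
vertices = [(0,), (1,)]
print(local_star_minimum(c, d, C, vertices, vertices, (0,)))
print(local_star_minimum(c, d, C, vertices, vertices, (1,)))
```

On this toy problem the two starting points stop at different local star minima, which is the situation step 3 of either algorithm is meant to detect.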

LEMMA 7 : Let the origin be translated to the point (x, y) obtained either by Algorithm A or 
by Algorithm B. Then (0, 0) is a pseudo-global minimum. 




PROOF: First consider Algorithm A. The algorithm is terminated at a point $(\bar{x}, \bar{y})$ which is
clearly a local star optimum from Theorem 1. Hence, from Theorem 4 we have $d'y \ge 0$. Now consider
a $y \in Y_0$ satisfying $d'y = 0$. Clearly it is an alternative optimum to Min $\phi(0, y)$, $y \in Y_0$ and can
be expressed as

$$y = \sum_{i=1}^{k} \lambda_i \bar{y}^i, \quad \sum_{i=1}^{k} \lambda_i = 1, \quad \lambda_i \ge 0$$

where $\bar{y}^i$ are the alternative extreme point optima at step 3. Hence, for $x \in X_0$, using the notation of
step 3,

$$c'x + x'Cy = \phi(x, y) = \phi\Big(x, \sum_i \lambda_i \bar{y}^i\Big) = \sum_i \lambda_i \phi(x, \bar{y}^i) \ge \sum_i \lambda_i \phi(\bar{x}^i, \bar{y}^i) \ge \phi(0, 0) = 0.$$

Hence, from Theorem 5, (0, 0) is a pseudo-global minimum.

Now consider Algorithm B. Let $S = \text{conv}[0, x^1, \ldots, x^m]$ where $x^i \in N(0)$. Then Min $\phi(x, y)$,
$x \in S$, $y \in Y_0$ is achieved at an extreme point of $S$ and $Y_0$. From step 3 of the algorithm

$$\phi(0, 0) \le \phi(x^i, y^i) \le \phi(x, y)$$

for each $x \in S$, $y \in Y_0$. Select a $\delta > 0$ such that $B_\delta(0) \cap X_0 \subset S$. Then clearly (0, 0) is a pseudo-global
minimum by definition.

IV. A CONVERGENT CUTTING PLANE ALGORITHM 

We will now show how a cut can be generated from a pseudo-global minimum using the concept
of generalized polars [6, 7].

DEFINITION: The generalized polar of $Y_0$ for a given scalar $k$ is given by
$Y^0(k) = \{ x : \phi(x, y) \ge k \text{ for all } y \in Y_0 \}$.

By definition $Y^0(k)$ contains no point $x \in X_0$ such that $\phi(x, y) < k$ for some $y \in Y_0$. Hence, if
$k$ is the current best value of the objective function, the problem is solved if $X_0 \subset Y^0(k)$.

Further, it can easily be verified that $Y^0(k)$ is compact and convex. As a matter of fact it is
polyhedral since it can be shown [28] that

$$Y^0(k) = \{ x : \phi(x, y^j) \ge k \text{ for all } y^j \in V \}$$

where $V$ is the set of extreme points of $Y_0$.
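
Because only the extreme points of the second polytope matter, membership in the generalized polar reduces to finitely many evaluations of the objective. A minimal Python sketch follows, assuming the extreme points are available as an explicit list (the hypothetical argument `y_vertices`) and using made-up one-dimensional data.

```python
def phi(c, d, C, x, y):
    """Bilinear objective c'x + d'y + x'Cy on plain Python sequences."""
    linear = sum(ci * xi for ci, xi in zip(c, x)) + sum(dj * yj for dj, yj in zip(d, y))
    cross = sum(x[i] * C[i][j] * y[j] for i in range(len(x)) for j in range(len(y)))
    return linear + cross


def in_generalized_polar(x, k, c, d, C, y_vertices):
    """True if phi(x, y) >= k at every listed extreme point y of Y0."""
    return all(phi(c, d, C, x, y) >= k for y in y_vertices)


# Hypothetical data: phi(x, y) = x + y - 3xy with Y0 = [0, 1].
c, d, C = [1], [1], [[-3]]
y_vertices = [(0,), (1,)]
print(in_generalized_polar((0.25,), 0, c, d, C, y_vertices))  # inside the polar for k = 0
print(in_generalized_polar((1,), 0, c, d, C, y_vertices))     # outside: phi(1, 1) = -1 < 0
```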

We will now discuss how a valid cut can be generated involving only the x-variables using
generalized polars. Let $(\bar{x}, \bar{y})$ be a pseudo-global minimum of BLP and let $\bar{p}' = (\bar{x}', \bar{u}')$. Suppose the
current best value of $\phi$ is $k$, which may or may not be equal to $\phi(\bar{x}, \bar{y})$.

DEFINITION: Given $m$ positive scalars $\lambda_1, \lambda_2, \ldots, \lambda_m$, the inequality

$$\sum_{j \in J} p_j / \lambda_j \ge 1$$

is a valid cutting plane with respect to $\bar{p}' = (\bar{x}', \bar{u}')$ if

$$\sum_{j \in J} \bar{p}_j / \lambda_j < 1$$

but

$$\sum_{j \in J} p_j / \lambda_j \ge 1$$

for all $p \in P$ such that $\phi(x, y) < k$ for some $y \in Y_0$, where $P = \{ (x', u') : Ex + u = e,\ x \ge 0,\ u \ge 0 \}$.

A valid cutting plane thus cuts off the pseudo-global minimum but does not cut off any feasible
point of $X_0$ which along with a $y \in Y_0$ yields an objective function value smaller than $k$.

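The following theorem states how a valid cutting plane can be generated from a pseudo-global
minimum.

THEOREM 8: Let $(\bar{x}, \bar{y})$ be a pseudo-global minimum and let the rays $\xi^j$ be as defined in
Theorem 6. Let $k$ be the current best value of $\phi$. Let $\lambda_j$ be defined by

$$\lambda_j = \begin{cases} \max \{ \lambda : \bar{x} - \bar{e}^j \lambda \in Y^0(k) \} & \text{if } \xi^j \not\subset Y^0(k) \\ \infty & \text{if } \xi^j \subset Y^0(k). \end{cases}$$

Then the inequality

$$\sum_{j \in J} p_j / \lambda_j \ge 1$$

is a valid cutting plane.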

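PROOF: For notational convenience, let us translate the origin to $(\bar{x}, \bar{y})$. Since $\phi(0, y) \ge \phi(0, 0)$
for all $y \in Y_0$, $0 \in Y^0(k)$. From Theorem 6, $X_0'$ has precisely $m$ edges incident on 0 and each $\xi^j$ contains
one such edge. Let $0 \ne \hat{x} \in X_0'$ be on ray $\xi^j$. We will show that there exists $\bar{\alpha} > 0$ such that
$\phi(\alpha_j \hat{x}, y) \ge 0$ for $0 \le \alpha_j \le \bar{\alpha}$ for all $y \in Y_0$. Suppose, on the contrary,

(2) $$\phi(\alpha_j \hat{x}, y) = \alpha_j (c'\hat{x} + \hat{x}'Cy) + d'y < 0$$

for some $y \in Y_0$. Since (0, 0) is a pseudo-global minimum, $d'y \ge 0$ by Theorem 5. If $d'y = 0$, by Theorem
5 $(c'x + x'Cy) \ge 0$ for all $x \in X_0$, i.e., $\phi(x, y) \ge \phi(0, y)$ since $d'y = 0$. This implies $x = 0$ solves Min
$\phi(x, y)$, $x \in X_0$. From Theorem 6 then $\phi(x, y) = \phi(0, y)$ is a supporting hyperplane to $X_0'$ at $x = 0$.
That is, $\phi(x, y) = c'x + x'Cy \ge 0$ for each $x \in X_0'$ and in particular for $x = \hat{x}$. Thus, $\phi(\alpha_j \hat{x}, y) \ge 0$ for all
$\alpha_j \ge 0$, a contradiction. On the other hand, if $d'y > 0$, from the expression for $\phi(\alpha_j \hat{x}, y)$, we have
$\phi(\alpha_j \hat{x}, y) \ge 0$ for all

$$\alpha_j \le \hat{\alpha}_j = \begin{cases} -d'y / (c'\hat{x} + \hat{x}'Cy) & \text{if } (c'\hat{x} + \hat{x}'Cy) < 0 \\ \infty & \text{otherwise.} \end{cases}$$

Again this is a contradiction. Now let

$$\bar{\alpha} = \min_j \hat{\alpha}_j$$

where $\bar{\alpha}$ can be taken to be arbitrarily large if all $\hat{\alpha}_j = \infty$. We have $\phi(\alpha_j \hat{x}, y) \ge 0$ for $0 \le \alpha_j \le \bar{\alpha}$ for all
$y \in Y_0$. In other words, $\alpha_j \hat{x} = (\bar{x} - \bar{e}^j \lambda) \in Y^0(k)$ with $\lambda > 0$ for each $j \in J$. Hence, noting that $Y^0(k)$ is
closed, $\lambda_j > 0$ exists, i.e., $0 \le 1/\lambda_j < \infty$ and $(\bar{x} - \bar{e}^j \lambda_j) \in Y^0(k)$.

To show that the cutting plane generated at (0, 0) is a valid cut, first note that

$$\sum_{j \in J} \bar{p}_j / \lambda_j = 0 < 1$$

since the nonbasic variables $\bar{p}_j$, $j \in J$, are zero. To complete the proof, consider a feasible

$$p = \begin{pmatrix} x \\ u \end{pmatrix}$$

such that $\phi(x, y) < k$ for some $y \in Y_0$. Note that feasibility implies $p \ge 0$. We will show that

$$\sum_{j \in J} p_j / \lambda_j \ge 1.$$

Note that by definition of $\lambda_j$,

$$\min_{y \in Y_0} \phi(-\bar{e}^j \lambda_j, y) = k \quad \text{or} \quad \phi(-\bar{e}^j \lambda_j, y) \ge k.$$

Hence, $x$ is in one open half space defined by $\phi(x, y) = k$ and $(-\bar{e}^j \lambda_j)$, $j \in J$, are in its complement,
which is a closed half space. Thus, $p$ is feasible to the cut

$$\sum_{j \in J} p_j / \lambda_j \ge 1$$

since the hyperplane $\sum_{j \in J} p_j / \lambda_j = 1$ passes through the points $(-\bar{e}^j \lambda_j)$. This completes the proof.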

In order to define a cutting plane we then need to determine $\lambda_j$ specified in Theorem 8. By
definition, for $j = 1, \ldots, m$

(3) $$\lambda_j = \max \Big\{ \lambda : \min_{y \in Y_0} \big[ c'(\bar{x} - \bar{e}^j \lambda) + d'y + (\bar{x} - \bar{e}^j \lambda)'Cy \big] \ge k \Big\}.$$

This amounts to solving $m$ parametric linear programming problems over $Y_0$. One way of implementing
this will be as follows: let

$$g(\lambda) = \min_{y \in Y_0} \{ c'(\bar{x} - \bar{e}^j \lambda) + d'y + (\bar{x} - \bar{e}^j \lambda)'Cy \}.$$

Since $g$ is a concave function of $\lambda$ (see [20]), it is unimodal. We can conduct a search for $\lambda_j$ over
an interval $(0, L)$, where $L$ is a large enough number, using Bolzano Search [31]. A linear programming
problem is solved for a fixed value of $\lambda$ in the interval. Note that we elected to apply the cut
only in the x-variables in order to take advantage of ease in solving problems in the y-variables.
At each iteration of the search, the "interval of uncertainty" is reduced by 1/2. The search is terminated
when a value $\lambda = \hat{\lambda}_j$ is obtained such that $g(\hat{\lambda}_j) - k \le \beta$, where $\beta > 0$ is a prespecified tolerance
level. Observe that $\hat{\lambda}_j \le \lambda_j$ and hence the cut using $\hat{\lambda}_j$ is also a valid cut. The search procedure
will require the solution of at most $(p + 1)$ linear programs, where $p$ is the smallest positive
integer for which $L / 2^p \le \beta$.
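
The interval-halving search just described can be sketched in Python as follows. Several things here are assumptions made for illustration and are not taken from the paper: the inner linear program is replaced by enumeration over a hypothetical list `y_vertices` of extreme points, the function name `bolzano_lambda` is invented, and the loop stops on the length of the interval of uncertainty rather than on the value of the objective. The returned value is the lower end of the final interval, so it never exceeds the true step length and the resulting cut stays valid.

```python
def phi(c, d, C, x, y):
    """Bilinear objective c'x + d'y + x'Cy on plain Python sequences."""
    linear = sum(ci * xi for ci, xi in zip(c, x)) + sum(dj * yj for dj, yj in zip(d, y))
    cross = sum(x[i] * C[i][j] * y[j] for i in range(len(x)) for j in range(len(y)))
    return linear + cross


def g(lam, x_bar, e_j, c, d, C, y_vertices):
    """Minimum over the y-vertices of phi(x_bar - lam * e_j, y): one LP per value of lam."""
    x = [xb - lam * e for xb, e in zip(x_bar, e_j)]
    return min(phi(c, d, C, x, y) for y in y_vertices)


def bolzano_lambda(k, x_bar, e_j, c, d, C, y_vertices, L=8.0, beta=1e-6):
    """Bisection for the largest lam in [0, L] with g(lam) >= k, assuming g(0) >= k."""
    if g(L, x_bar, e_j, c, d, C, y_vertices) >= k:
        return L  # the ray stays in the polar up to L; treat the step as "large"
    lo, hi = 0.0, L  # invariant: g(lo) >= k > g(hi)
    while hi - lo > beta:
        mid = 0.5 * (lo + hi)
        if g(mid, x_bar, e_j, c, d, C, y_vertices) >= k:
            lo = mid
        else:
            hi = mid
    return lo


# Hypothetical data: phi(x, y) = x + y - 3xy, ray x = lam from the origin, incumbent k = -1.
lam = bolzano_lambda(k=-1, x_bar=[0.0], e_j=[-1.0], c=[1], d=[1], C=[[-3]],
                     y_vertices=[(0,), (1,)])
print(lam)  # close to 1, where 1 - 2*lam first drops below k
```

Concavity of the inner minimum is what makes bisection legitimate: the set of step lengths that keep the point in the polar is an interval starting at zero.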

The proposed cutting plane algorithm can be summarized as follows: 

1. If the unexplored feasible region $X_0^i$ at stage $i$ is empty, terminate. Otherwise, find a pseudo-global
minimum. If necessary, update the value of the current best solution.

2. Solve the parametric linear programs to obtain $\lambda_j$.

3. Introduce the cut and return to step 1. 

Convergence of the algorithm is readily proved. Let $\{x^i, y^i\}$ be the sequence of pseudo-global
minima generated and $H(x^i)$ be the cutting plane generated at $x^i$. At stage $i$, the algorithm is
terminated if $X_0 \cap H^+(x^i) = \emptyset$. Otherwise, the cut $H(x^i)$ is applied and a new pseudo-global point
$(x^{i+1}, y^{i+1})$ is found, where $x^{i+1} \in X_0 \cap H^+(x^i)$ and $x^i \notin H^+(x^i)$. We wish to show that the sequence $\{x^i\}$
has a limit point $x^k$ such that $X_0 \cap H^+(x^k) = \emptyset$.

Since the points $x^i$ are in a compact set, there exists a limit point $x^k$ such that for a given $\epsilon > 0$
and a positive integer $N$, $\|x^i - x^k\| < \epsilon$ for some $i > N$. If $X_0 \cap H^+(x^k) \ne \emptyset$, all subsequent pseudo-global
minima $(x^l, y^l)$ generated will satisfy the condition $x^l \in H^+(x^k)$, $l \ge k + 1$. Also, from the proof
of Theorem 8, $x^k \in B_\delta(x^k) \cap X_0$ and $x^l \notin B_\delta(x^k)$ for some $\delta > 0$. Hence, $\|x^l - x^k\| \ge \delta$ for all $l \ge k + 1$.
This contradicts the statement that $x^k$ is a limit point. Hence, $X_0 \cap H^+(x^k) = \emptyset$ and the algorithm
is terminated.

We can readily show that the cuts generated using generalized polars are uniformly stronger
than those generated by Konno [23]. This cut is of the form

$$\sum_{j \in J} d_j p_j \ge \sigma$$

where the $d_j$ are positive constants which are selected and $\sigma$ is a parameter which is defined to
satisfy certain conditions. For each $j \in J$, Konno calculates the maximum value of $\sigma_j$ such that
$(\bar{x} - \bar{e}^j \sigma_j / d_j) \in Y^0(k)$, and then selects $\sigma$ such that $\sigma / d_j \le \sigma_j / d_j$ for all $j \in J$. Now $\sigma_j / d_j \le \lambda_j$, hence
$\sigma / d_j \le \lambda_j$ for all $j \in J$. Hence the cuts generated from $Y^0(k)$ are stronger than those generated by Konno's
method.

There is a revealing geometric interpretation to this difference. Konno predetermines the
coefficients $d_j$ of the cutting plane so as to simplify computational work. But this has the effect of
fixing the slope of the hyperplane. The hyperplane is then translated parallel to itself until
one point on it touches the boundary of $Y^0(k)$. The polar cut allows the additional flexibility
of altering the slope of the cutting plane so as to cut off more of the feasible region.

V. A FINITELY CONVERGENT ε-OPTIMAL ALGORITHM

The above algorithm can be converted to a finitely convergent ε-optimal solution procedure
by using $Y^0(k - \epsilon)$ instead of $Y^0(k)$ in determining $\lambda_j$. In this case, a cut can be initiated at a local
star minimum rather than a pseudo-global minimum. Note that from Theorem 1,

$$\phi(0, y) \ge \phi(0, 0) \ge k > k - \epsilon,$$

for each $y \in Y_0$. Hence, $0 \in \text{int } Y^0(k - \epsilon)$. This implies that there exists an $\bar{\alpha}$ such that $\phi(\alpha_j \hat{x}, y) \ge k$
for $0 \le \alpha_j \le \bar{\alpha}$ and for all $y \in Y_0$, where $\hat{x}$ is as defined in the proof of Theorem 8. The remaining parts
of Theorem 8 hold. Hence, a valid cut can be generated from a local star minimum.

We will now show that the algorithm is finite. At stage $i$, suppose the local star minimum is
$(\bar{x}, \bar{y})$ and let $k_i$ be the current best value of $\phi$, attained at $(\hat{x}, \hat{y})$, which may or may not be the
same as $\phi(\bar{x}, \bar{y})$. Then $(\hat{x}, \hat{y})$ is the ε-optimal solution over all $y \in Y_0$ and over the part of $X_0$ already explored,
i.e., for all $x$ in $X_0$ but not in $X_0^{i+1}$. Note that $k_i$ is decreased at each stage at least by a fixed
$\epsilon > 0$. Hence, denoting by $(x^*, y^*)$ the global optimum of BLP, if we show that $x^*$ is cut off by a cut
obtained from $Y^0(k)$ for some $k$, then the algorithm is finite.

Now consider BLP with $(\bar{x}, \bar{y})$ as the local star minimum. If $(x^*, y^*)$, with $x^* \in X_0^i$, is the global minimum,
then $x^*$ is in the cone with vertex $\bar{x}$ and $\xi^j$, $j \in J$, as the generators. Then $x^*$ can be expressed as a
convex combination of points $x^j$, $j \in J$, on these generators. Let

$$\alpha_j = \min_{y \in Y_0} \phi(x^j, y)$$

and

$$k = \min_{j \in J} \alpha_j.$$

Then $Y^0(k)$ will cut the rays $\xi^j$ at points $(0, \ldots, \lambda_j, \ldots, 0) \ge x^j$. Hence, the cut

$$\sum_{j \in J} p_j / \lambda_j \ge 1$$

obtained from $Y^0(k)$ will cut off $x^*$. This shows that the ε-optimal algorithm is finite.

VI. GENERATION OF DEEPER CUTS 

We are grateful to the referee for bringing to our attention some recent work of Burdet [8, 9]
dealing with convex gauge functions which permits generation of deeper cuts than that given in
Section IV. This approach allows $\lambda_j$ to be negative and generates uniformly dominating cuts. Owen
[25] seems to be the first to suggest this approach, and related work on negative edge extensions
has been investigated in [4, 16, 18, 22]. Using these results the cut

$$\sum_{j \in J} p_j / \lambda_j \ge 1$$

can be strengthened by using the following definition for $\lambda_j$:

$$\lambda_j = \begin{cases} \max_{\lambda > 0} \{ \lambda < \infty : \phi(\bar{x} - \bar{e}^j \lambda, y) \ge k \text{ for all } y \in Y_0 \} & \text{if } \xi^j \not\subset Y^0(k) \\ -\big[ \max_{\lambda > 0} \{ \lambda < \infty : \phi(\bar{x} + \bar{e}^j \lambda, y) \ge k \text{ for some } y \in Y_0 \} \big] & \text{if } \xi^j \subset Y^0(k). \end{cases}$$

If $\xi^j \not\subset Y^0(k)$, $0 < \lambda_j < \infty$ can be computed from equation (3) given earlier by solving parametric
transportation problems. If $\xi^j \subset Y^0(k)$, as pointed out by the referee, the value of $\lambda_j < 0$ can be computed
by solving parametric problems similar to that used for (3). More specifically, in this case
we need to solve the parametric problem:

$$\lambda_j = -\Big[ \max_{\lambda > 0} \Big\{ \lambda : \max_{y \in Y_0} \phi(\bar{x} + \bar{e}^j \lambda, y) \ge k \Big\} \Big].$$

X,<0 !/«Ko 



REFERENCES 

[1] Balas, E., "Intersection Cuts — A New Type of Cutting Planes for Integer Programming," 

Operations Research, 19, 19-39 (1971). 
[2] Balas, E., "Integer Programming and Convex Analysis: Intersection Cuts from Outer Polars,"
Mathematical Programming, 2, 330-382 (1972).
[3] Balas, E., "Nonconvex Quadratic Programming Via Generalized Polars," Management
Science Research Report No. 278, GSIA, Carnegie-Mellon University (1972).
[4] Balas, E., "Disjunctive Programming: Properties of the Convex Hull of Feasible Points,"
Management Science Research Report 348, GSIA, Carnegie-Mellon University (1974).
[5] Balas, E., and C. A. Burdet, "Maximizing a Convex Quadratic Function Subject to Linear
Constraints," Management Science Research Report No. 299, GSIA, Carnegie-Mellon
University (1973).
[6] Burdet, C. A., "Polaroids: A New Tool in Nonconvex and in Integer Programming," Naval
Research Logistics Quarterly, 20, 13-22 (1973).
[7] Burdet, C. A., "On Polaroid Intersections," in Mathematical Programming in Theory and
Practice, P. Hammer and G. Zoutendijk, eds. (North Holland, 1974).
[8] Burdet, C. A., "Elements of a Theory in Nonconvex Programming," to appear in Naval
Research Logistics Quarterly.
[9] Burdet, C. A., "Convex and Polaroid Extensions," to appear in Naval Research Logistics
Quarterly.
[10] Cabot, V. A., and R. L. Francis, "Solving Certain Nonconvex Quadratic Minimization Problems
by Ranking the Extreme Points," Operations Research, 18, 82-86 (1970).
[11] Candler, W., and R. J. Townsley, "The Maximization of a Quadratic Function of Variables
Subject to Linear Inequalities," Management Science, 10, 515-523 (1964).
[12] Cottle, R. W., and W. C. Mylander, "Ritter's Cutting Plane Method for Nonconvex Quadratic
Programming," in Integer and Nonlinear Programming, J. Abadie, ed. (North Holland,
1970).
[13] Gallo, G., and A. Ülkücü, "Bilinear Programming: An Exact Algorithm," Report ORC 73-26,
Operations Research Center, University of California, Berkeley (1973).
[14] Geoffrion, A. M., "Elements of Large-Scale Mathematical Programming," Management
Science, 16, 652-691 (1970).
[15] Glover, F., "Convexity Cuts and Cut Search," Operations Research, 21, 123-134 (1973).
[16] Glover, F., "Polyhedral Convexity Cuts and Negative Edge Extensions," Zeitschrift für
Operations Research, 18, 181-186 (1974).
[17] Glover, F., "Convexity Cuts for Multiple Choice Problems," Discrete Mathematics, 6, 221-234
(1973).
[18] Glover, F., "Polyhedral Annexation in Mixed Integer and Combinatorial Programming,"
Mathematical Programming, 8, 161-188 (1975).
[19] Glover, F., and D. Klingman, "Concave Programming Applied to a Special Class of 0-1
Integer Programs," Operations Research, 21, 135-140 (1973).
[20] Hillier, F. S., and G. J. Lieberman, Introduction to Operations Research (Holden-Day, 1974).
[21] Hu, T. C., "Minimizing a Concave Function in a Convex Polytope," Mathematics Research
Center Report No. 1011, U.S. Army, Madison, Wisconsin (1969).
[22] Jeroslow, R. G., "The Principles of Cutting-Plane Theory: Part 1," with an addendum,
GSIA, Carnegie-Mellon University (1974).
[23] Konno, H., "Bilinear Programming," Parts I and II, Technical Report No. 71-9 and 71-10,
Operations Research House, Stanford University (1971).
[24] Mueller, R. K., "A Method for Solving the Indefinite Quadratic Programming Problem,"
Management Science, 16, 333-339 (1970).
[25] Owen, G., "Cutting Planes for Programs with Disjunctive Constraints," Journal of Optimization
Theory and Applications, 11, 49-55 (1973).
[26] Ritter, K., "A Method for Solving Maximum-Problems with a Nonconcave Quadratic Objective
Function," Z. Wahrscheinlichkeitstheorie verw., 4, 340-351 (1966).
[27] Tuy, H., "Concave Programming Under Linear Constraints" (Russian), Doklady Akademii Nauk
SSSR (1964). English translation in Soviet Mathematics, 5, 1437-1440 (1964).
[28] Vaish, H., "Nonconvex Programming with Applications to Production and Location Problems,"
Unpublished Ph.D. Dissertation, Georgia Institute of Technology (1974).
[29] Vaish, H., and C. M. Shetty, "The Bilinear Programming Problem," Naval Research Logistics
Quarterly, 23, No. 2 (1976).
[30] Young, R. D., "Hypercylindrically Deduced Cuts in Zero-One Integer Programming,"
Operations Research, 19, 1393-1405 (1971).
[31] Zangwill, W. I., Nonlinear Programming: A Unified Approach (Prentice-Hall, 1969).






THE EFFECT OF CORRELATED EXPONENTIAL SERVICE TIMES ON 

SINGLE SERVER TANDEM QUEUES 



C. R. Mitchell 
U.S. Air Force Academy 

A. S. Paulson 
Rensselaer Polytechnic Institute 

C. A. Beswick 
University of South Carolina 



ABSTRACT 

An investigation via simulation of system performance of two stage queues in 
series (single server, first-come, first-served) under the assumption of correlated 
exponential service times indicates that the system's behavior is quite sensitive to 
departures from the traditional assumption of mutually independent service times, 
especially at higher utilizations. That service times at the various stages of a tandem 
queueing system for a given customer should be correlated is intuitively appealing 
and apparently not at all atypical. Since tandem queues occur frequently, e.g. 
production lines and the logistics therewith associated, it is incumbent on both the 
practitioner and the theoretician that they be aware of the marked effects that may 
be induced by correlated service times. For the case of infinite interstage storage, 
system performance is improved by positive correlation and impaired by negative 
correlation. This change in system performance is reversed however for zero inter- 
stage storage and depends on the value of the utilization rate for the case where 
interstage storage equals unity. The effect due to correlation is shown to be statis- 
tically significant using spectral analytic techniques. For correlation equal unity and 
infinite interstage storage, results are provided for two through twenty-five stages 
in series to suggest how adding stages affects system performance for ρ > 0. In this
extreme case of correlation, adding stages has an effect on system performance 
which depends markedly on the utilization rate. Recursive formulae for the waiting 
time per customer for the cases of zero, one, and infinite interstage storage are
derived.



1. INTRODUCTION 

First, we describe two physical settings where dependent service times can be expected to arise. 
In a paper mill, large rolls of paper typically pass through an inspection or winding operation prior 
to being cut into smaller rolls. A poor quality roll takes a relatively longer time in the inspection 
process because defective sections must be removed and splices made. When this same roll reaches 
the final cutting stage it must be processed more slowly to avoid breaking the splices and to repair 
them when they do break. Hence process times at the two stages tend to be correlated; indeed, it is
conceptually possible that they be highly correlated. The process times at the two stages on any 


two different rolls would generally be independently distributed. In the current context consider- 
able interest would be centered on the effect, if any, produced by nonindependence of process times 
at different stages. 

Jackson [9], in discussing queueing systems with phase type service, pointed out that a typical 
sequence of events in the overhaul of an aircraft engine consists of stripping, detailed examination, 
repairs, assembly, and testing. Generally, an engine with a large number of maintenance require- 
ments can be expected to spend more time in each of the latter four phases and so the possible effect 
of correlated service times on throughput time would be of interest. It is not difficult to envision a 
host of other situations involving queues in series in which the service times at the various stages 
for a given customer are correlated. 

A large proportion of the literature concerning tandem queues has centered on Poisson arrival
processes, exponential service times, and steady state solutions. The assumption of independence of
service times is intricately interwoven into the fabric of the traditional birth-death equation approach
to finding a transient and steady state solution to the tandem queueing phenomenon. We
shall remain within this same framework with the exception that we shall drop the heretofore universal
(but tacit!) assumption of mutual independence of all exponential service times. An obvious
approach is to use a multivariate exponential distribution with non-zero correlations in place of the
usual independent exponential service times. In our situation it is not clear that the birth-death
equation approach can be modified to incorporate dependent service times. Moreover, any such
formulation would very likely be analytically intractable. The problem is, however, amenable to a
simulation approach and it is in this way that we assess the effect of departures from independence
of service times on steady state system performance.

2. THE SERIES QUEUEING SYSTEM AND RECURSIVE FORMULAE 

Consider the series queueing process depicted in Figure 1. Customers from an infinite population
arrive at a two stage system according to a Poisson process with mean rate λ which we shall,
without loss of generality, take to be unity. An unlimited queue is always allowed before the first
stage, but before the second stage the queue length may be either restricted or unlimited. A single
server is allowed at each stage; the service discipline is first-come, first-served.

[Figure 1. Two stage series queue with dependent service times. Customers ..., $c_{n+2}$, $c_{n+1}$, $c_n$, $c_{n-1}$
arrive in accordance with a Poisson process with intensity λ and pass through Stage 1 (single element
server), an interstage storage of capacity $q - 1$, and Stage 2 (single element server). Customer service
times are governed by a bivariate exponential distribution with means $\mu_1$ and $\mu_2$ and correlation ρ,
$-0.25 \le \rho \le 1.0$.]

The system performance measure is taken to be mean waiting time per customer and in this
section we develop a set of formulae to recursively compute the waiting time per customer. We
use the recursive formulae for the unlimited interstage storage case in order to demonstrate precisely
how the queueing system consisting of two stages in series with dependent service times is
related to a single server system with interdependent arrival and service processes as discussed by
Bhat [2]. An interpretation of Conolly [4] for a special type of this latter interdependence is shown
to be helpful in suggesting why mean waiting time is affected by correlated service times.

Denote by $(T_{n,1}, T_{n,2})$ the times between arrival epochs of customers $c_{n-1}$ and $c_n$ at the first
and second stages and let $c_n$ experience the service times $(S_{n,1}, S_{n,2})$ at each stage, $n = 1, 2, \ldots$.
The sequences of interarrival times $\{T_{n,1}, T_{n,2}\}$ and the $\{S_{n,1}, S_{n,2}\}$ for different customers are
both assumed to be mutually independent.

We take $(w_n^{(1)}, w_n^{(2)})$ to be the waiting times, excluding service, and $(W_n^{(1)}, W_n^{(2)})$ to be the
total waiting times, of customer $c_n$, at the respective stages, $n = 1, 2, \ldots$. We illustrate these
definitions with an arbitrary combination of arrival and service times in the following diagram.
The illustration is for two queues in series with unlimited interstage storage; diagrams like this are
useful in developing the recursive formulae for the different cases to be presented.



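[Diagram: time lines for customers $c_n$ and $c_{n+1}$, marking the epochs at which each arrives at the
first stage and departs the first stage (arriving at the second stage), together with the intervals
$T_{n+1,1}$, $T_{n+1,2}$, $S_{n,1}$, $S_{n,2}$, $S_{n+1,1}$, $S_{n+1,2}$, the waiting times $w_{n+1}^{(1)}$, $w_{n+1}^{(2)}$, and the total
waiting times $W_n^{(1)}$, $W_n^{(2)}$, $W_{n+1}^{(1)}$, $W_{n+1}^{(2)}$.]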

CASE A: Two stage queues in series, unlimited interstage storage. 

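Customer $c_{n+1}$'s total waiting time at the first stage and interarrival time at the second stage
are given by

(1) $$W_{n+1}^{(1)} = \begin{cases} S_{n+1,1}, & \text{if } T_{n+1,1} \ge W_n^{(1)} \\ W_n^{(1)} - T_{n+1,1} + S_{n+1,1}, & \text{if } T_{n+1,1} < W_n^{(1)} \end{cases}$$

and

(2) $$T_{n+1,2} = \begin{cases} T_{n+1,1} - W_n^{(1)} + S_{n+1,1}, & \text{if } T_{n+1,1} \ge W_n^{(1)} \\ S_{n+1,1}, & \text{if } T_{n+1,1} < W_n^{(1)}. \end{cases}$$

The condition in (1) and (2) that $T_{n+1,1} \ge W_n^{(1)}$ simply means that $c_{n+1}$ arrives at server one
after $c_n$ has departed, and likewise $T_{n+1,1} < W_n^{(1)}$ means $c_{n+1}$ arrives before $c_n$ leaves.
Similar to (1), $c_{n+1}$'s waiting time at the second stage is

(3) $$W_{n+1}^{(2)} = \begin{cases} S_{n+1,2}, & \text{if } T_{n+1,2} \ge W_n^{(2)} \\ W_n^{(2)} - T_{n+1,2} + S_{n+1,2}, & \text{if } T_{n+1,2} < W_n^{(2)}. \end{cases}$$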


The above diagram illustrates (1), (2), and (3) for $T_{n+1,1} \ge W_n^{(1)}$ and $T_{n+1,2} < W_n^{(2)}$. Similar
diagrams result for the remaining conditions.

In an obvious way, we can use these relationships to build up a set of recursive formulae for 
any number of stages in series where the interstage storage between stages is unlimited. (See 
Appendix.) 
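
The recursions (1), (2), and (3) translate directly into a short routine. The Python sketch below is illustrative only: the function name and the list-based interface are assumptions, the first customer is assumed to find the system empty, and the sample numbers are arbitrary rather than drawn from a bivariate exponential distribution.

```python
def case_a_total_waits(T1, S1, S2):
    """Total waiting times at both stages with unlimited interstage storage.

    T1[n] is the interarrival time at stage 1 between customers n-1 and n
    (T1[0] is ignored), S1[n] and S2[n] are customer n's service times.
    Assumption: the first customer finds the system empty.
    """
    W1, W2 = [S1[0]], [S2[0]]
    for n in range(1, len(S1)):
        if T1[n] >= W1[-1]:                  # arrives after predecessor left stage 1
            w1 = S1[n]                       # equation (1), first case
            t2 = T1[n] - W1[-1] + S1[n]      # equation (2), first case
        else:                                # must queue behind predecessor
            w1 = W1[-1] - T1[n] + S1[n]      # equation (1), second case
            t2 = S1[n]                       # equation (2), second case
        if t2 >= W2[-1]:                     # equation (3)
            w2 = S2[n]
        else:
            w2 = W2[-1] - t2 + S2[n]
        W1.append(w1)
        W2.append(w2)
    return W1, W2


# Arbitrary illustrative data for three customers.
print(case_a_total_waits(T1=[0, 1, 1], S1=[2, 1, 1], S2=[1, 3, 1]))
```

Averaging the returned lists over a long simulated run gives the mean waiting time per customer used as the performance measure.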

Since each customer must proceed through both stages, the output of the first stage becomes
the input of the second stage and therefore we have, in steady state, that arrivals at the second
stage form a Poisson process with the same intensity parameter λ as the input process [3].
Unlike the first stage however, $c_n$'s service time at the second stage
is correlated with the interarrival time there. In the above diagram, this corresponds to a correlation
between $S_{n+1,2}$ and $T_{n+1,2}$. This result is apparent from (2) since $S_{n+1,1}$ and $S_{n+1,2}$ are dependent
by assumption.

If $S_{n+1,1}$ and $S_{n+1,2}$ are independent, as is usually assumed for two stage server systems, then
each stage, in steady state, can be analyzed independently, and since $T_{n+1,2}$ and $S_{n+1,2}$ are independent,
as are $T_{n+1,1}$ and $S_{n+1,1}$, the regular M/M/1 results obtain for each stage.

Bhat [2] describes five different classes of single server first-come first-served systems with 
Poisson input and exponential service times which result from relaxing some of the assumptions 
of independence which are typically made. These classes represent more realistic operating systems 
than those with assumptions of independence; Bhat further points out that more work needs to 
be done on these problems than the limited amount reported at that time (1969). One of these 
classes is for systems with interdependent arrival and service processes, as is the case here for
$T_{n+1,2}$ and $S_{n+1,2}$.

Conolly [4] and Conolly and Hadidi [5] have studied a dependent structure somewhat similar 
to this wherein the ratio of service time to interarrival time is constant for all n; they give transient 
as well as steady state results for the system. Conolly showed numerically that this pattern of 
server behavior results in a drastic reduction in the mean and variance of the waiting time as 
compared with a conventional M/M/1 queue. It was noted by Conolly that this kind of server 
behavior is to be expected from a well regulated service facility where the server adjusts the service 
time of a customer according to that customer's interarrival time, which the server observes with- 
out error. In this way, a long interval gives rise to a long service time, and short intervals corre- 
sponding to a succession of rapid arrivals are followed by correspondingly short service times. This 
regulated behavior therefore prevents a long queue from forming and cuts down on the mean and 
variance of the waiting time in the system. 

Returning to the two stages in series problem under study we see that this system, via equation (2), can be viewed as a type of self-regulated system since $S_{n+1,2}$ and $T_{n+1,2}$ are related, although not in the deterministic way assumed by Conolly. It will be demonstrated later that our type of stochastic dependence between $S_{n+1,2}$ and $T_{n+1,2}$ gives rise to results which are consistent with Conolly's. This artificial way of viewing the system as a self-regulating device is employed solely to make the effects seem more reasonable and in no way influences the results.

CASE B: Two stage queues in series, no interstage storage. 

For this case, $c_n$'s total waiting time at the second stage, $W_n^{(2)}$, is always equal to $S_{n,2}$, so the only quantity of interest here is $W_n^{(1)}$. Since there is restricted (zero) interstage storage, the phenomenon of blocking occurs and so the waiting time computation is a bit more complicated than in Case A. (In effect, the first server's utilization is diminished [17].)



CORRELATED EXPONENTIAL SERVICE TIMES 






The total waiting time for $c_{n+1}$ at the first server is given by one of four relationships depending upon whether $c_{n+1}$ arrives at stage one before or after $c_n$ leaves.

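For $T_{n+1,1} < W_n^{(1)}$,

(4)
$$W_{n+1}^{(1)} = \begin{cases} W_n^{(1)} - T_{n+1,1} + S_{n+1,1}, & \text{if } S_{n+1,1} > S_{n,2} \\ W_n^{(1)} - T_{n+1,1} + S_{n,2}, & \text{if } S_{n+1,1} < S_{n,2} \end{cases}$$

and for $T_{n+1,1} > W_n^{(1)}$,

(5)
$$W_{n+1}^{(1)} = \begin{cases} S_{n+1,1}, & \text{if } S_{n+1,1} > W_n^{(1)} - T_{n+1,1} + S_{n,2} \\ W_n^{(1)} - T_{n+1,1} + S_{n,2}, & \text{if } S_{n+1,1} < W_n^{(1)} - T_{n+1,1} + S_{n,2} \end{cases}$$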

The following diagram illustrates (4) for $S_{n+1,1} < S_{n,2}$.

[Diagram: time lines for customers $c_n$ and $c_{n+1}$, showing $T_{n+1,1}$, the waiting times at stage one, the interval during which server 1 is blocked, and $S_{n+1,2}$.]

Similarly, the other conditions can be verified. 
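The four relationships can be collected into a single update. The following is a minimal sketch of one step of the Case B recursion as we read (4) and (5); the function name and argument names are hypothetical and are not taken from the original simulation program.

```python
def next_wait_case_b(w1_prev, t_next, s1_next, s2_prev):
    """Total time of c_{n+1} at stage one when there is no interstage storage.

    w1_prev : W_n^(1), total time of c_n at stage one
    t_next  : T_{n+1,1}, interarrival time of c_{n+1} at stage one
    s1_next : S_{n+1,1}, service time of c_{n+1} at stage one
    s2_prev : S_{n,2}, service time of c_n at stage two
    """
    if t_next < w1_prev:
        # c_{n+1} queues until c_n leaves stage one, is served, and can
        # leave only after c_n has vacated stage two (equation (4)).
        return w1_prev - t_next + max(s1_next, s2_prev)
    # c_{n+1} finds stage one empty, but c_n may still occupy stage two
    # for a further w1_prev - t_next + s2_prev time units (equation (5)).
    release = w1_prev - t_next + s2_prev
    return max(s1_next, release)


# Blocked case: service (1.0) ends before c_n leaves stage two (3.0).
print(next_wait_case_b(5.0, 2.0, 1.0, 3.0))
# Unblocked case: c_n has already left the system.
print(next_wait_case_b(2.0, 3.0, 1.0, 0.5))
```

Writing both branches with `max` makes the blocking interpretation explicit: the customer leaves stage one at the later of its own service completion and the moment stage two becomes free.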

CASE C: Two stage queues in series, interstage storage capacity equals one. 

As in the previous case blocking can occur at the first stage but here a customer's total waiting 
time at stage two can exceed the service time since interstage storage is permitted on a restricted 
basis. 

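If $T_{n+1,1} < W_n^{(1)}$,

(6)
$$W_{n+1}^{(1)} = \begin{cases} W_n^{(1)} - T_{n+1,1} + S_{n+1,1}, & \text{if } S_{n+1,1} > W_n^{(2)} - S_{n,2} \\ W_n^{(1)} - T_{n+1,1} + W_n^{(2)} - S_{n,2}, & \text{if } S_{n+1,1} < W_n^{(2)} - S_{n,2} \end{cases}$$

and $c_{n+1}$'s interarrival time at server two is

(7)
$$T_{n+1,2} = \begin{cases} S_{n+1,1}, & \text{if } S_{n+1,1} > W_n^{(2)} - S_{n,2} \\ W_n^{(2)} - S_{n,2}, & \text{if } S_{n+1,1} < W_n^{(2)} - S_{n,2}. \end{cases}$$

If $T_{n+1,1} > W_n^{(1)}$,

(8)
$$W_{n+1}^{(1)} = \begin{cases} S_{n+1,1}, & \text{if } S_{n+1,1} > W_n^{(1)} - T_{n+1,1} + W_n^{(2)} - S_{n,2} \\ W_n^{(1)} - T_{n+1,1} + W_n^{(2)} - S_{n,2}, & \text{if } S_{n+1,1} < W_n^{(1)} - T_{n+1,1} + W_n^{(2)} - S_{n,2} \end{cases}$$

and

(9)
$$T_{n+1,2} = \begin{cases} T_{n+1,1} - W_n^{(1)} + S_{n+1,1}, & \text{if } S_{n+1,1} > W_n^{(1)} - T_{n+1,1} + W_n^{(2)} - S_{n,2} \\ W_n^{(2)} - S_{n,2}, & \text{if } S_{n+1,1} < W_n^{(1)} - T_{n+1,1} + W_n^{(2)} - S_{n,2}. \end{cases}$$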






C. R. MITCHELL, A. S. PAULSON AND C. A. BESWICK 



Next the total waiting time of $c_{n+1}$ at stage two, $W_{n+1}^{(2)}$, is computed by using (3) in Case A with $T_{n+1,2}$ as defined in (7) or (9).
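One step of the Case C recursion may be sketched as follows. This is an illustration under our reading of (6) through (9) together with (3), namely that $c_{n+1}$ cannot leave stage one until $c_n$ has entered service at stage two, which happens $W_n^{(2)} - S_{n,2}$ after $c_n$ reaches stage two. All names are hypothetical.

```python
def step_case_c(w1_prev, w2_prev, t1_next, s1_next, s2_prev, s2_next):
    """Return (W_{n+1}^(1), W_{n+1}^(2)) for interstage storage capacity one."""
    # Time c_n spends in the single buffer position before its own service.
    buffer_wait = w2_prev - s2_prev
    if t1_next < w1_prev:
        # c_{n+1} starts service when c_n leaves stage one: (6) and (7).
        hold = max(s1_next, buffer_wait)
        w1_next = w1_prev - t1_next + hold
        t2_next = hold
    else:
        # c_{n+1} finds stage one empty: (8) and (9).
        release = w1_prev - t1_next + buffer_wait
        if s1_next > release:
            w1_next = s1_next
            t2_next = t1_next - w1_prev + s1_next
        else:
            w1_next = release
            t2_next = buffer_wait
    # Equation (3) of Case A applied at stage two.
    if t2_next < w2_prev:
        w2_next = w2_prev - t2_next + s2_next
    else:
        w2_next = s2_next
    return w1_next, w2_next


print(step_case_c(4.0, 3.0, 1.0, 1.0, 1.0, 2.0))  # blocked at stage one
print(step_case_c(2.0, 1.0, 5.0, 1.5, 1.0, 0.5))  # empty system on arrival
```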

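The following diagram is descriptive of (8) and (9) where $T_{n+1,1} > W_n^{(1)}$ and [second condition not recoverable].

[Diagram: time lines for customers $c_n$ and $c_{n+1}$ at the two stages, showing $W_n^{(1)}$, $S_{n,1}$, $T_{n+1,1}$, $W_n^{(2)}$, $S_{n,2}$, $T_{n+1,2}$, and the corresponding quantities for $c_{n+1}$.]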



3. A BIVARIATE EXPONENTIAL DISTRIBUTION 

There are a number of bivariate exponential distributions which could be used to describe the dependence assumed between $S_1$ and $S_2$ (we drop the subscripts n for now). We choose to use a special case of the bivariate gamma distribution discussed by Wicksell [18] and Kibble [12] and more generally by Krishnamoorthy and Parthasarty [14] and Paulson [16].
The functional form can be written as

(10)
$$f(s_1, s_2) = \frac{a}{\theta_1 \theta_2} \exp\left(-\frac{s_1}{\theta_1} - \frac{s_2}{\theta_2}\right) I_0\left(2\left(\frac{d\, s_1 s_2}{\theta_1 \theta_2}\right)^{1/2}\right)$$

where $s_1>0$, $s_2>0$, and

$$I_0(z) = \sum_{k=0}^{\infty} \frac{(z/2)^{2k}}{(k!)^2}$$

is the modified Bessel function of the first kind and order zero. Here $a>0$, $d>0$, and $a+d=1$. (See Downton [7].)

The actual simulation of variates $\mathbf{S} = (S_1, S_2)'$ from the distribution (10) and its generalization due to Paulson [16] is effected via

(11)
$$\mathbf{S} = \mathbf{X}_1 + V_1 \mathbf{X}_2 + V_1 V_2 \mathbf{X}_3 + \cdots;$$

here $\mathbf{X}_j$ is a 2-vector of independent exponential variates with mean vector $(\theta_1, \theta_2)'$ and the $V_j$ are random $2 \times 2$ matrices which take on values in the set

$$\begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}, \quad \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \quad \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}, \quad \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$

with probabilities a, b, c, d respectively. All the $\mathbf{X}_j$'s and $V_j$'s are mutually independent. Note that eventually the product $\prod V_j$ will result in a matrix of zeroes and so with probability one $\mathbf{S}$ is represented by a finite sum [11, 13].
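The random sum (11) translates directly into a sampling loop. The sketch below is an illustration only, not the authors' program: the function name is hypothetical, and because every $V_j$ is diagonal with 0/1 entries, the running product $V_1 V_2 \cdots V_j$ is tracked with two flags rather than with matrices.

```python
import random


def sample_pair(theta1, theta2, a, b, c, d, rng):
    """Draw one pair (S1, S2) from the random sum (11).

    a, b, c, d are the probabilities of the four diagonal 0/1 matrices,
    in the order diag(0,0), diag(1,0), diag(0,1), diag(1,1).
    """
    if abs(a + b + c + d - 1.0) > 1e-12:
        raise ValueError("a + b + c + d must equal 1")
    if b + d >= 1.0 or c + d >= 1.0:
        raise ValueError("the product of the V_j would never vanish")
    s1 = s2 = 0.0
    keep1 = keep2 = True  # diagonal of the running product V_1 ... V_j
    while keep1 or keep2:
        x1 = rng.expovariate(1.0 / theta1)  # exponential with mean theta1
        x2 = rng.expovariate(1.0 / theta2)  # exponential with mean theta2
        if keep1:
            s1 += x1
        if keep2:
            s2 += x2
        u = rng.random()
        if u < a:
            v1, v2 = False, False
        elif u < a + b:
            v1, v2 = True, False
        elif u < a + b + c:
            v1, v2 = False, True
        else:
            v1, v2 = True, True
        keep1 = keep1 and v1
        keep2 = keep2 and v2
    return s1, s2


rng = random.Random(12345)
print(sample_pair(1.0, 1.0, 0.5, 0.0, 0.0, 0.5, rng))
```

With $b=c=0$ the two components share the same geometric number of terms, which is the source of the positive dependence in the special case $a+d=1$.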

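The bivariate random variable $\mathbf{S}$ in (11) has mean vector

(12)
$$\boldsymbol{\mu} = \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix} = \begin{pmatrix} \theta_1/(a+c) \\ \theta_2/(a+b) \end{pmatrix}$$

and covariance matrix

(13)
$$\Sigma = \begin{pmatrix} \mu_1^2 & \dfrac{ad-bc}{a+b+c}\,\mu_1\mu_2 \\ \dfrac{ad-bc}{a+b+c}\,\mu_1\mu_2 & \mu_2^2 \end{pmatrix},$$

and for the correlation $\rho$, $-0.25 \le \rho \le 1$.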

4. SIMULATION RESULTS AND INTERPRETATION 

Simulated results are presented in this section for three cases of interstage storage capacity: (A) infinite ($q=\infty$), (B) zero ($q=1$), and (C) one ($q=2$). For the infinite interstage storage case results will be given for two stages in series for various values of correlation and for two through twenty-five stages in series for correlation equal to unity. The latter depicts how adding stages might affect system performance given correlation $\rho>0$; more precisely, it provides an envelope within which system performance will vary since, for a fixed number of stages and utilization, correlation unity provides one extremum and correlation zero provides another. In each of the three cases we allow infinite storage before stage one. In the ordinary case in which the correlation between paired service times is zero, a few steady state results are available for comparison purposes.

We have taken the mean arrival rate to be unity and so the steady state utilization, $\nu$, at stage $i$ is simply the mean service time $\mu_i$. It will suffice for our purposes to take $\mu_1=\mu_2=\mu$ since similar steady state behavior will obtain for $\mu_1 \ne \mu_2$. Furthermore, there do not seem to be many results available for purposes of comparison for Cases B and C when $\mu_1 \ne \mu_2$. For $\lambda=1$, our system performance measure of mean waiting time (queueing plus service) is equivalent to the expected number in the system.

CASE A: k stage queues in series, infinite interstage storage. (Graphs are labeled kQ for k queues.)

Steady state results for k stages in series with no correlation between pairs of service times are available [17] and we have that the expected number of customers at each stage is $\nu/(1-\nu)$ and $k\nu/(1-\nu)$ in the system.

Figure 2 provides for two stages in series the mean waiting time at the second stage, $\bar{W}_n^{(2)}$, for $\nu=0.75$ and $\rho=-0.25$, 0, 0.50, and 1.0. In this case the mean waiting time at the first stage is independent of $\rho$ since no blocking occurs, and hence it suffices to examine the mean waiting time at the second stage to determine the effects of correlated service times. In some of the simulation results to follow we replicate, many times, runs of much shorter length; here we choose to illustrate the mean waiting time as a function of n with one very long run. Long runs, such as this one, may be considered as being composed of replications of smaller runs where the starting condition of a new


[Figure 2 plot: $\bar{W}_n^{(2)}$ versus customers (in thousands), n up to 100. Legend, second-stage mean with system mean in parentheses: $\rho=-0.25$, 3.24 (6.31); $\rho=0$, 3.05 (6.04); $\rho=0.50$, 2.78 (5.79); $\rho=1.0$, 2.40 (5.37).]

Figure 2. Mean waiting time at second stage, 2Q, $q=\infty$, $\nu=0.75$.

replication is the ending value of the previous replication [6]. From the figure it is clear that each 
graph tends to stabilize for increasing n in accordance with the law of large numbers. 

The numbers adjacent to the values of p in Figure 2 are the mean waiting times for the second 
stage and mean waiting times in the system after 100,000 service completions. We point out that for 
$\rho=0$ the mean waiting times at stages one and two are 2.99 and 3.05 respectively, and these are in close agreement with the expectation of 3.0 for this utilization. For $\rho \ne 0$, we see here a bonus attached to positive correlation in service times since system performance improves with increasing
correlation. On the other hand, system performance deteriorates with negative correlation. 

Each figure like this one has a starting condition based on the mean waiting time from a pilot 
run and then we omit the waiting times of the first 1000 customers in the actual computations 
shown. 

Figure 3 gives system performance for different values of $\nu$ and for $\rho=0, 1$. These graphs are intended to show that there is no discernible effect due to correlation $\rho>0$ at utilization $\nu=0.6$ but as $\nu$ increases from 0.6 to 0.9 a definite trend appears.

Figure 4 shows the ratio of mean waiting time in the system for various values of $\rho$ to the mean waiting time in the system at $\rho=0$. These kinds of graphs are based on an average of 100 replications of 1400 service completions (after an initialization of 400 service completions was discarded).

The solid lines in Figures 4, 7, and 9 depict a smoothed fit to the actual data. Sampling varia- 
tion, of course, precludes the possibility of obtaining such a smooth fit without extremely long runs 
or extensive replications, but each curve was spot-checked to ascertain whether or not the fit was 
spurious. In no case was any substantial deviation recorded. 

Now we show how these effects are consistent with Conolly's [4] results for the case of $\nu=0.9$
and correlation of p = 1.0 between the service times in the two stages. Conolly showed for his single 
server queueing system where the ratio of service time to the interarrival time was constant for all 
n, that for a utilization of 0.9 (the ratio) the mean waiting time (queueing plus service) was 2.71. 
For service time independent of interarrival time the steady state expectation, for this utilization, 
is 9.0. The interarrival time and service time in Conolly's system are perfectly correlated, whereas 
in our system the two service times are perfectly correlated. It is clear from equation (2) that the 
correlation between the interarrival time at the second stage and the service time there is less than 
one, and so the improvement in system performance for our system should be less than Conolly's 
(an elementary derivation shows the correlation to be $\nu\rho$ or 0.9 in this case). We see from Figure 3



[Figure 3 plots: $\bar{W}_n^{(2)}$ versus customers (in thousands), one panel per utilization. Legend, second-stage mean with system mean in parentheses: (a) $\nu=0.6$: $\rho=0$, 1.53 (3.03); $\rho=1$, 1.49 (2.99). (b) $\nu=0.7$: $\rho=0$, 2.38 (4.74); $\rho=1$, 2.05 (4.39). (c) $\nu=0.8$: $\rho=0$, 4.15 (8.21); $\rho=1$, 2.84 (6.87). (d) $\nu=0.9$: $\rho=0$, 9.27 (18.39); $\rho=1$, 4.13 (13.17).]

Figure 3. Mean waiting time at second stage, 2Q, $q=\infty$.



[Figure 4 plot: ratio of mean waiting times in the system versus correlation $\rho$, with one curve per utilization $\nu$.]

Figure 4. Ratio of mean waiting times, 2Q, $q=\infty$.



[Figure 5a plot: mean waiting time versus customers (in thousands), n up to 1000. Curve labels, with the cumulative ratio in parentheses: $\bar{W}^{(5)}$, 4.90 (.608); $\bar{W}^{(4)}$, 4.72 (.624); $\bar{W}^{(3)}$, 4.48 (.657); $\bar{W}^{(2)}$, 4.15 (.736); the value for $\bar{W}^{(1)}$ is not legible.]

Figure 5a. Mean waiting time, 5Q, $q=\infty$, $\rho=1.0$, $\nu=0.9$.



[Figure 5b plot: the ratio $\sum_{i=1}^{k} \bar{W}^{(i)} / (k\nu/(1-\nu))$ versus the number of servers in series (2 through 25), with one curve per utilization $\nu$.]

Figure 5b. Ratio of mean waiting time with $\rho=1$ to expected mean waiting time with $\rho=0$, 2Q-25Q, $q=\infty$ (based on 100,000 service completions).

that the mean waiting time at the second stage after 100,000 customers is 4.13, and indeed the improvement is less.
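The value $\nu\rho$ quoted for the correlation between interarrival and service times at the second stage can be recovered in a few lines. The sketch below assumes, as in Case A, that the interarrival time at stage two is the stage-one service time plus any idle time of server one, and it uses Burke's result [3] that the departures from the first stage form a Poisson process with the arrival rate (here $\lambda=1$). The idle-time symbol $I_{n+1}$ is introduced only for this illustration.

```latex
\begin{align*}
T_{n+1,2} &= S_{n+1,1} + I_{n+1}, \qquad
  I_{n+1} = \max\bigl(0,\; T_{n+1,1} - W_n^{(1)}\bigr), \\
\operatorname{Cov}(T_{n+1,2}, S_{n+1,2})
  &= \operatorname{Cov}(S_{n+1,1}, S_{n+1,2}) = \rho\mu^2
  \quad \text{(since } I_{n+1} \text{ is independent of } S_{n+1,2}\text{)}, \\
\operatorname{Var}(T_{n+1,2}) &= 1/\lambda^2 = 1, \qquad
  \operatorname{Var}(S_{n+1,2}) = \mu^2, \\
\operatorname{Corr}(T_{n+1,2}, S_{n+1,2})
  &= \frac{\rho\mu^2}{1 \cdot \mu} = \rho\mu = \nu\rho .
\end{align*}
```

For $\nu=0.9$ and $\rho=1$ this gives 0.9, which is why the improvement should fall short of that in Conolly's perfectly correlated system.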

In Figure 5a we show the mean waiting time as a function of n for five stages in series where the service times are equal at each stage. The graphs are labeled $\bar{W}^{(k)}$ corresponding to the mean waiting time at stage k, k=1, 2, . . ., 5. We see that the mean waiting time $\bar{W}^{(2)}$, for the second stage, is consistent with the results in Figure 3. The results for $\bar{W}^{(3)}$, $\bar{W}^{(4)}$, and $\bar{W}^{(5)}$ suggest that further improvements in system performance occur over the $\rho=0$ case but the effect seems to approach a limit. The number in parentheses to the right of $\bar{W}^{(k)}$ is the ratio

$$\sum_{i=1}^{k} \bar{W}^{(i)} \Big/ \bigl(k\nu/(1-\nu)\bigr).$$






Figure 5b shows this ratio for two through twenty-five stages in series for $\rho=1$. These results were
obtained by extending recursive formulae (1), (2), and (3). In this extreme case of correlation, 
adding stages has an effect on system performance which depends markedly on the utilization rate; 
e.g., for $\nu=0.7$ system performance is improved through the first four stages and then is reduced.
A utilization of 0.9 gives rise to much improved system performance through twenty-five stages. 

CASES B and C: Two stage queues in series, finite (including zero) interstage storage. 

For these cases the utilization is effectively reduced in value [17]. The maximum effective utilization is $\nu_{\max}=(q+1)/(q+2)$ where the queue in stage two is limited to a length of $q-1$ units. We consider the cases $q=1$ and $q=2$.

Figure 6 shows the mean waiting time at the first stage for $q=1$ and several values of $\nu$. For this case each customer's waiting time at the second stage is simply the service time there so we are concerned only with the waiting time process at stage one. Figure 7 shows, for stage one, the ratio of mean waiting time at stage one with $\rho \ne 0$ to the mean waiting time at stage one with $\rho=0$.

Steady state results for the mean number of customers in the system, $L$, for $\rho=0$, $q=1$ and with utilization $\nu$ are given in Morse [15]; we have that

(14)
$$L = 4\nu(2-\nu^2)/((2+\nu)(2-3\nu)).$$

For $\nu=0.4$, 0.5, and 0.6 and for $\rho=0$, the observed (expected) values of $L$ are 1.55 (1.53), 2.87 (2.80), and 7.80 (7.57) respectively. The observed values are from Figure 6.
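The expected values quoted above follow directly from (14). A short check, with a hypothetical function name, is:

```python
def mean_number_no_storage(nu):
    """Equation (14): mean number in system for rho = 0 and q = 1."""
    return 4.0 * nu * (2.0 - nu ** 2) / ((2.0 + nu) * (2.0 - 3.0 * nu))


for nu in (0.4, 0.5, 0.6):
    print(nu, round(mean_number_no_storage(nu), 2))
# prints 1.53, 2.8 and 7.57 for the three utilizations
```

The factor $(2-3\nu)$ in the denominator vanishes at $\nu=2/3$, which agrees with the maximum effective utilization $(q+1)/(q+2)$ for $q=1$.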

For $\rho \ne 0$, again we see a dramatic effect in system performance. System performance deteriorates as the correlation $\rho$ increases through positive values and improves as $\rho$ decreases through negative values. Hence when there is no storage allowed before stage two, the departure from independence results in significantly different steady state behaviors, especially as the value $\nu_{\max}$ is approached.

[Figure 6 plots: $\bar{W}_n^{(1)}$ versus customers (in thousands), one panel per utilization. Legend, first-stage mean with system mean in parentheses: (a) $\nu=0.4$: $\rho=1$, 1.38 (1.78); $\rho=0$, 1.15 (1.55). (b) $\nu=0.5$: $\rho=0$, 2.37 (2.87). (c) $\nu=0.6$: $\rho=1$, 9.23 (9.83); $\rho=0$, 7.20 (7.80).]

Figure 6. Mean waiting time at first stage, 2Q, $q=1$.

[Figure 7 plot: ratio of mean waiting times at the first stage versus correlation $\rho$.]

Figure 7. Ratio of mean waiting times at first stage, 2Q, $q=1$.

[Figure 8 plots: mean waiting time at the first stage versus customers (in thousands), one panel per utilization, with ending values of the second-stage mean $\bar{W}^{(2)}$ for $\rho=0$ and $\rho=1$.]

Figure 8. Mean waiting time at first stage, 2Q, $q=2$.

Finally, we consider the case $q=2$. Figure 8 shows the mean waiting time as a function of n at the first stage and the ending value for the mean waiting time in the second stage. The mean waiting time at stage two was very stable for all values of n so those values will not be illustrated. The effect for $\nu=0.6$ is in the same direction as for $q=1$ but reverses as $\nu$ increases so that for values close to $\nu_{\max}$ the change in system performance is consistent with the $q=\infty$ case; that is, improvement for $\rho>0$ and deterioration for $\rho<0$. Figure 9 shows the effect in this case for $\rho \ne 0$.




5. SPECTRAL ANALYSIS OF $\{W_n^{(1)}\}$ AND $\{W_n^{(2)}\}$

In this section we analyze the time series $\{W_n^{(1)}\}$ and $\{W_n^{(2)}\}$ and apply a nonparametric test to the ratio of certain estimated power spectra associated with them.

Let $\{z_t,\ t=1, 2, \ldots, N\}$ be a realization of a stochastic process with mean $\mu$ and autocovariances $\gamma_k$, $k=1, 2, \ldots$. A study of a time series in terms of its autocovariances is referred to as a time domain analysis. Another type of analysis is concerned with the frequency content of the time series, namely spectral analysis [1, 8, 10]. The Fourier cosine transform of the autocovariances $\gamma_0, \gamma_1, \gamma_2, \ldots$ is called the power spectrum. Denoting the power spectrum by $f(\omega)$, we can write

(15)
$$f(\omega) = 2\left[\gamma_0 + 2\sum_{k=1}^{\infty} \gamma_k \cos 2\pi\omega k\right], \qquad 0 \le \omega \le 1/2,$$

and inverting $f(\omega)$ we can express $\gamma_k$ as

(16)
$$\gamma_k = \int_0^{1/2} f(\omega) \cos 2\pi\omega k \, d\omega, \qquad k=0, 1, 2, \ldots,$$

with $\gamma_k$ estimated by

$$c_k = \frac{1}{N} \sum_{t=1}^{N-k} (z_{t+k} - \bar{z})(z_t - \bar{z}),$$

where $\bar{z} = \frac{1}{N}\sum_{t=1}^{N} z_t$.

When $k=0$ we obtain the variance $\gamma_0$ of the process as the integral of the power spectrum:

(17)
$$\gamma_0 = \int_0^{1/2} f(\omega)\, d\omega.$$

Thus the power spectrum can be considered as a decomposition of the variance at different frequencies.

To get sample results which are statistically consistent we do not estimate the spectrum at a particular frequency but instead estimate the average power about the frequency of concern. The average power corresponds to weighting the autocovariances in the time domain and we typically estimate $f(\omega)$ with the truncated estimate

(18)
$$\hat{f}(\omega_j) = 2\left[\lambda_0 c_0 + 2\sum_{k=1}^{m} \lambda_k c_k \cos 2\pi\omega_j k\right]$$

where $\omega_j = j/(2m)$, $j=0, 1, 2, \ldots, m$, and the weights $\lambda_k$, $k=0, 1, 2, \ldots, m$, form a so-called lag window. We choose the Blackman-Tukey "hamming" window,

(19)
$$\lambda_k = 0.54 + 0.46 \cos(\pi k/m), \qquad k=0, 1, 2, \ldots, m.$$

In (18), the sample autocovariances $c_{m+1}, c_{m+2}, \ldots$ are omitted since, for m sufficiently large, they should contribute little information. As a result, only m autocovariances need be calculated and savings in computation may be considerable. Considerable care must be used when selecting m, however, because too large a value will increase the variance of the estimates and too small a value will not give enough resolution.
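The truncated, windowed estimate can be computed in a few lines. The sketch below is an illustration under stated assumptions: sample autocovariances use the divisor N, the sum in (18) runs over lags 1 through m, and the leading constant 2 is the normalization under which the spectrum integrates to the variance over $0 \le \omega \le 1/2$. Function names are hypothetical.

```python
import math


def autocov(z, max_lag):
    """Sample autocovariances c_0, ..., c_max_lag with divisor N."""
    n = len(z)
    zbar = sum(z) / n
    return [
        sum((z[t + k] - zbar) * (z[t] - zbar) for t in range(n - k)) / n
        for k in range(max_lag + 1)
    ]


def hamming_spectrum(z, m):
    """Windowed spectral estimates at frequencies j/(2m), j = 0, ..., m."""
    c = autocov(z, m)
    # Blackman-Tukey "hamming" lag window, equation (19).
    lam = [0.54 + 0.46 * math.cos(math.pi * k / m) for k in range(m + 1)]
    estimates = []
    for j in range(m + 1):
        w = j / (2.0 * m)
        total = lam[0] * c[0] + 2.0 * sum(
            lam[k] * c[k] * math.cos(2.0 * math.pi * w * k)
            for k in range(1, m + 1)
        )
        estimates.append(2.0 * total)  # assumed normalizing constant
    return estimates


# A series alternating between +1 and -1 has all its power at frequency 1/2.
series = [1.0, -1.0] * 50
spec = hamming_spectrum(series, 10)
print(spec.index(max(spec)))  # index of the peak, here the last frequency
```

Because the test in this section uses only ratios of estimates at the same frequency, the choice of normalizing constant does not affect it.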

Next we examine several sample power spectra associated with the simulated waiting times for the two-server infinite interstage storage case. We take the simulated values $\{W_n^{(i)}\}$, $n=1, 2, \ldots, N$; $i=1, 2$, to be time series where, as before, $W_n^{(i)}$ is the total waiting time, queueing plus service, of customer n at server i. Figure 10 shows a portion of the sample spectra for $\{W_n^{(1)}\}$ and $\{W_n^{(2)}\}$, $n=1, 2, \ldots, 2000$, and for correlation values of $\rho=0$, 0.25, 0.50, and 1.0. Utilization, $\nu$, is 0.90. The 2000 sample values were chosen from the end of a simulation run of length 30,000 to ensure that any possible effects of start-up conditions were eliminated. After making several pilot runs, m in equation (18) was set equal to 400. For $\rho=0.50$ and $\rho=1.0$ in Figure 10 it is obvious that the waiting times at the second server give rise to different spectra than the waiting times at the first server.

Since the integral of the power spectrum measures the variance of the process and the area under the sample spectrum should be indicative of the sample variance, we see that the effect of positive correlation is to reduce the variance of the waiting time process. Again this is consistent with Conolly's results [4] for the single server system in which a customer's service is completely determined by the length of the interarrival interval separating himself and his predecessor. For a utilization of 0.9, Conolly's system reduces the steady state variance of the waiting time from 81, for the classic M/M/1 system, to 1.16; the sample variance associated with the waiting times at server 2 in Figure 10d for $\rho=1.0$ is 2.05 (the sample variance associated with the waiting times at server 1 is 60.1). Recall from Section 4 that the condition $\rho=1.0$ for the correlation between a customer's service times at the two servers is equivalent to a correlation of $\nu$, or 0.9 in this case, between his interarrival time and service time at the second server. Therefore, the reduction in variance is consistent with Conolly's results since the corresponding correlation in his system is one.




[Figure 9 plot: ratio of mean waiting times in the system versus correlation $\rho$; one curve is labeled $\nu=.68$.]

Figure 9. Ratio of mean waiting times in system, 2Q, $q=2$.



[Figure 10 plots: sample power spectra (logarithmic vertical scale) versus frequency from 0 to about 0.075 for server 1 and server 2, one panel per value of $\rho$.]

Figure 10. A portion of waiting time spectra, 2Q, $\nu=0.9$, $q=\infty$.



Next we develop a nonparametric test for the hypothesis that $f^{(1)}(\omega)=f^{(2)}(\omega)$, $0 \le \omega \le 0.5$, where $f^{(i)}(\omega)$ represents the power spectrum at frequency $\omega$ associated with the time series $\{W_n^{(i)}\}$, $n=1, 2, \ldots, N$; $i=1, 2$. The Blackman-Tukey "hamming" lag window in (19) gives rise to spectral estimates which are not independent, and so we employ the notion of equivalent independent estimates [10] which implies, for this window, that estimates are approximately independent if they are about 5/(4m) cycles apart. Since the estimates in (18) are separated by a basic frequency of 1/(2m) cycles, this spacing of 5/(4m) cycles amounts to taking, as independent, those estimates which are separated by an interval of 2.5 times the basic frequency. Since the waiting times are not normally distributed and the assumption of normality is implicit in the development of equivalent independent estimates, we take this spacing of 2.5 times the basic frequency simply to be a rough guide. Actually, the normality assumption is more critical for making distributional assumptions about the spectral estimates than for the usage here. To select a practical spacing and to reduce any possible effects of the normality assumption we take estimates at the frequencies $j/(2m)$, $j=1, 4, 7, \ldots$, to be approximately independent (the spacing here is 3 times the basic frequency). Therefore, of the 401 estimates in each spectrum partially illustrated in Figure 10, we take 134 estimates at the frequencies $j/800$, $j=1, 4, 7, \ldots, 399$, to be approximately independent.

Now for each approximately independent estimate we can regard the ratio $f^{(1)}(\omega)/f^{(2)}(\omega)$ as a Bernoulli trial (greater than one or less than one) and under the null hypothesis of homogeneity of the two spectra we can take as a test statistic the number of ratios which are less than unity. Figure 11 shows the ratio for $\rho=0$ and $\rho=0.25$. Of the 134 approximately independent ratios in Figure 11a, 64 are less than unity and in Figure 11b, 43 of the 134 ratios are less than unity. Under the null hypothesis, a ratio greater than unity is as likely as a ratio less than unity, and taking a normal approximation to the implied binomial distribution, we have a probability of 0.31 associated with observing 64 or fewer ratios less than unity in 11a and a probability of .001 associated with observing 43 or fewer ratios less than unity in 11b. Although not illustrated, the results for $\rho=0.50$ and $\rho=1.0$ are even more conclusive: for $\rho=0.50$, 23 of the 134 approximately independent estimates are less than unity, and for $\rho=1.0$, none of the ratios is less than unity. Therefore, for the case presented here we have good statistical evidence that the power spectra associated with the waiting times at each server are not homogeneous for correlation $\rho>0$. We expect similar results to obtain for other values of correlation, utilization, and interstage storage.

[Figure 11 plots: ratio of sample spectral estimates plotted against the index of the approximately independent estimates (20 through 140), panels (a) and (b).]

Figure 11. Ratio of sample power spectral estimates, 2Q, $\nu=0.9$, $q=\infty$.
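The tail probabilities of this sign test can be approximated as follows. This is a minimal sketch that assumes a plain normal approximation to the binomial with no continuity correction (the text does not say whether one was applied), so the figures it produces need not match the quoted probabilities to the last digit. The function name is hypothetical.

```python
import math


def lower_tail_normal(k, n, p=0.5):
    """Normal approximation to P(X <= k) for X ~ Binomial(n, p)."""
    mean = n * p
    sd = math.sqrt(n * p * (1.0 - p))
    z = (k - mean) / sd
    # Standard normal distribution function written with math.erf.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


print(lower_tail_normal(64, 134))  # roughly 0.3 for the rho = 0 case
print(lower_tail_normal(43, 134))  # far below .001 for the rho = 0.25 case
```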

APPENDIX 

In this appendix we show how a set of recursive formulae for the waiting times can be con- 
structed for any number of queues in series where interstage storage is unlimited. 

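Referring to the following diagram for $c_n$ and $c_{n+1}$'s queueing and service times at stages two and three (a continuation of the diagram preceding equation (1) in the text), we see that $c_{n+1}$'s interarrival time at stage three is

$$T_{n+1,3} = \begin{cases} S_{n+1,2}, & \text{if } T_{n+1,2} < W_n^{(2)} \\ T_{n+1,2} - W_n^{(2)} + S_{n+1,2}, & \text{if } T_{n+1,2} > W_n^{(2)}. \end{cases}$$

[Diagram: time lines for customers $c_n$ and $c_{n+1}$ at stages two and three, showing $W_n^{(2)}$, $S_{n,2}$, $W_n^{(3)}$, $S_{n,3}$, $T_{n+1,2}$, $T_{n+1,3}$, $W_{n+1}^{(2)}$, $S_{n+1,2}$, $W_{n+1}^{(3)}$, and $S_{n+1,3}$.]

Similar to equation (3) in the text, $c_{n+1}$'s waiting time at the third stage is

$$W_{n+1}^{(3)} = \begin{cases} W_n^{(3)} - T_{n+1,3} + S_{n+1,3}, & \text{if } T_{n+1,3} < W_n^{(3)} \\ S_{n+1,3}, & \text{if } T_{n+1,3} > W_n^{(3)}. \end{cases}$$

Comparing $T_{n+1,2}$ and $T_{n+1,3}$ we have, in general, for $c_{n+1}$'s interarrival time at stage i, i=2, 3, . . .,

$$T_{n+1,i} = \begin{cases} S_{n+1,i-1}, & \text{if } T_{n+1,i-1} < W_n^{(i-1)} \\ T_{n+1,i-1} - W_n^{(i-1)} + S_{n+1,i-1}, & \text{if } T_{n+1,i-1} > W_n^{(i-1)}. \end{cases}$$

Similarly, comparing $W_{n+1}^{(1)}$, $W_{n+1}^{(2)}$ and $W_{n+1}^{(3)}$ gives a general recursive formula for $c_{n+1}$'s waiting time at stage i, i=1, 2, . . .,

$$W_{n+1}^{(i)} = \begin{cases} W_n^{(i)} - T_{n+1,i} + S_{n+1,i}, & \text{if } T_{n+1,i} < W_n^{(i)} \\ S_{n+1,i}, & \text{if } T_{n+1,i} > W_n^{(i)}. \end{cases}$$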

Thus, we can obtain the recursive formulae for any number of queues in series. 
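The general recursion lends itself to a single pass over the stages. The following sketch, with hypothetical names, advances the waiting times of one customer through any number of stages with unlimited interstage storage, as we read the formulae of this appendix.

```python
def step_tandem(w_prev, t1_next, s_next):
    """Waiting times of c_{n+1} at every stage, given those of c_n.

    w_prev  : list of W_n^(i), i = 1, ..., k
    t1_next : T_{n+1,1}, interarrival time at stage one
    s_next  : list of S_{n+1,i}, i = 1, ..., k
    """
    w_next = []
    t = t1_next  # interarrival time at the current stage
    for w, s in zip(w_prev, s_next):
        if t < w:
            # c_{n+1} arrives while c_n is still at this stage.
            w_next.append(w - t + s)
            t = s  # departures are then separated by the service time
        else:
            # c_{n+1} finds the stage empty and is served at once.
            w_next.append(s)
            t = t - w + s
    return w_next


print(step_tandem([3.0, 2.0], 1.0, [1.0, 4.0]))    # queues at both stages
print(step_tandem([1.0, 1.0], 3.0, [0.5, 0.25]))   # finds both stages empty
```

Feeding the returned list back in as `w_prev` for the next customer reproduces the recursion for an entire run.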



REFERENCES 

[1] Anderson, T. W., The Statistical Analysis of Time Series (John Wiley & Sons, Inc., New York, 1971).
[2] Bhat, U. N., "Queueing Systems with First-Order Dependence," Opsearch, 6, 1-24 (1969).
[3] Burke, P. J., "The Output of a Queueing System," Operations Research, 4, 699-704 (1956).
[4] Conolly, B. W., "The Waiting Time Process for a Certain Correlated Queue," Operations Research, 16, 1006-1015 (1968).
[5] Conolly, B. W. and N. Hadidi, "A Correlated Queue," Journal of Applied Probability, 6, 122-136 (1969).
[6] Conway, R. W., "Some Tactical Problems in Digital Simulation," Management Science, 10, 47-61 (1963).
[7] Downton, F., "Bivariate Exponential Distributions in Reliability Theory," J. Royal Statist. Society, Series B, 33, 408-417 (1970).
[8] Fishman, G. S. and P. J. Kiviat, "The Analysis of Simulation-Generated Time Series," Management Science, 13, 525-557 (1967).
[9] Jackson, R. R. P., "Queueing Systems with Phase Type Service," Operations Research, 5, 109-120 (1954).
[10] Jenkins, G. M., "General Considerations in the Analysis of Spectra," Technometrics, 3, 133-166 (1961).
[11] Kesten, H., "Random Difference Equations and Renewal Theory for Products of Random Matrices," Acta Mathematica, 131, 207-248 (1973).
[12] Kibble, W. F., "A Two-Variate Gamma Type Distribution," Sankhya, 5, 137-150 (1941).
[13] Kohberger, R. C., "On Certain Multivariate Exponential Distributions," Ph. D. Thesis, Rensselaer Polytechnic Institute, Troy, New York (1975).
[14] Krishnamoorthy, A. S. and M. Parthasarty, "A Multivariate Gamma Type Distribution," Ann. Math. Statist., 22, 549-557 (1951).
[15] Morse, P. M., Queues, Inventories and Maintenance (John Wiley & Sons, Inc., New York, 1958).
[16] Paulson, A. S., "A Characterization of the Exponential Distribution and a Bivariate Exponential Distribution," Sankhya, Series A, 35, 69-78 (1973).
[17] Saaty, T. L., Elements of Queueing Theory with Applications (McGraw-Hill, New York, 1961).
[18] Wicksell, S. D., "On Correlation Functions of Type III," Biometrika, 25, 121-133 (1933).



SINGLE-LANE BRIDGE SERVING TWO-LANE TRAFFIC 



Z. Eshcoli and I. Adiri 



Technion — Israel Institute of Technology 
Haifa, Israel 



ABSTRACT 



This paper presents a mathematical model of a single-lane bridge serving two- 
way traffic in alternating directions (with an FIFO rule observed within each direc- 
tional queue). While the bridge serves cars moving in one direction, cars approach- 
ing from the opposite direction wait in a queue at its foot. When cars in the current 
direction finish crossing the bridge, it begins serving cars from the other direction, 
if any are present. A newly-arrived car finding an empty bridge mounts it immedi- 
ately. Several cars moving in the same direction may occupy the bridge simul- 
taneously. The crossing speed is assumed to be constant, and the arrival processes 
in both directions are assumed to be independent, homogeneous Poisson processes. 
A generalization of the alternating-priority models [1, 2] is developed to arrive at 
the Laplace-Stieltjes transform and the expected value of the flow time (the time 
interval between the moments of arrival at the bridge and departure from it) for 
steady state conditions. The results are discussed and some examples are presented 
graphically. 



1. INTRODUCTION 

Cars arrive at a single-lane bridge in two opposite directions, numbered 1 and 2, according to
independent homogeneous Poisson processes with arrival rates $\lambda_1$ and $\lambda_2$, respectively. A car going
in direction i (i=l, 2), hereinafter referred to as "type-i car," may cross the bridge if the latter 
is empty or carrying cars of the same type; otherwise (i.e., if cars of the other type are crossing) 
it must wait in a queue at the foot of the bridge. 

The time interval between the moment when the bridge starts serving type-i cars and the 
moment when it carries none of them is called "type-i phase." At the end of type-i phase (i=1, 2),
either the bridge is empty or type-j cars ($j \ne i$) have queued up at its foot. This queue, called the
initial queue, now mounts the bridge in order of arrival. The crossing speed is assumed constant, 
hence the crossing time is also constant. This assumption is not unrealistic since usually the speed 
limit on such a narrow bridge is far below the free speed of modern cars. 

When there is no queue of type-i cars but the bridge still carries them, a newly-arrived type-i 
car does not have to wait, thus spending only the crossing time in the system. 

The crucial difference between the model described above and an alternating-priorities queueing model ([1, 2, Chapter 9]) is that here the service facility (the bridge) can accommodate several customers (cars) simultaneously; accordingly, the term "service" will need a special definition, given later on.

Our aim is to find the steady-state distribution and the expectation of the flow time (the time 
interval between the moments of arrival at the bridge and departure from it) of a car in the system 
as a function of the bridge's length (or equivalently the time spent crossing the bridge). Although 
our analysis is based on [1] and [2], these models may be derived as special cases of the model dis- 
cussed in this paper. The situation described above is not limited to the case of a narrow bridge, 
but is applicable to any one-lane road servicing two-directional traffic, a common situation when
repairing a road. Assume that a two-lane road of given length has to be repaired. The repair will
be done by choosing one lane for traffic alternately. The maintenance chief would like to do the
work in one stretch to save set-up costs; on the other hand, this policy creates long queues and in 
turn high flow times. Having the necessary cost data, the results obtained may serve as a guideline 
in determining the optimal partition for the repair of the road. 

At the end a discussion of the model is given, and the expected steady-state flow time as a func- 
tion of the length of the bridge, for specific parameters, is presented graphically. 

Related models were studied by several authors. Darroch, Newell and Morris [3] considered 
a model in which a vehicle-actuated traffic light controls two intersecting traffic streams. The 
light is kept green for lane i (i=1, 2) until any existing queue of type-i cars has been discharged
(the "discharging time"), and further until a headway of duration at least $\beta_i$ is detected in the
subsequent arrivals (the "extension time"). The main difference between [3] and our model is 
that in [3] the discharging time and the extension time are independent random variables. Thus 
the light may change to green for lane-i even if there are no cars present in this lane, and this light 
stays green only for the extension time during which cars arriving from the intersecting lane have 
to wait even though no type-i cars are present. This case cannot happen in our model where a busy 
bridge is available for type-i cars iff type-i cars are crossing it. In another paper, Hawkes [4] assumed 
generally distributed crossing times and alternating priorities discipline. The expected waiting 
time of a type-i car was calculated and this result, obviously, coincides with the result in [1] and 
subsequently may be obtained as a special case of our model. Tanner [5] discussed a similar model 
in which the crossing times were also constants but the queue discipline was different: A type-i
car could cross the bridge if there were no type-j cars ($j \neq i$) on the bridge and the last type-i car
had started crossing at least $\beta_i$ time units ago. An explicit formula for the expected waiting time
of a type-i car was presented only when $\beta_1 = 0$ or $\beta_2 = 0$.

2. MATHEMATICAL MODEL 

2.1. Basic Relations 

Let $Y_i$ (i=1, 2) be the mounting time* of a type-i car with the exception of a type-i car
initiating a type-i phase which has $Y_i^0$ as its mounting time, and let $S_i$ be the constant crossing
time of a type-i car. The mounting times are assumed to be non-negative arbitrarily-distributed
random variables independent of each other and of the interarrival times, and possessing finite
second moments. The crossing times are assumed to be finite positive constants. Let the arrival
process of cars in direction i (i=1, 2) be a homogeneous Poisson process with an average of $\lambda_i$ cars per time
unit.

*Defined as the time interval between the moments two successive cars (present in the system) begin to cross
the bridge.

We denote: 

(1) $\lambda = \lambda_1 + \lambda_2,$
and

(2) $\rho_i = \lambda_i E(Y_i), \quad i = 1, 2,$
then under the above assumptions the system is non-saturated if:

(3) $\rho = \rho_1 + \rho_2 < 1,$

see Section 4.

At steady state the system undergoes cycles of length $T_c$, the components of each cycle being
an idle period $T_a$ (the time period during which the bridge is empty), and a busy period $T_b$ (the
time interval between two successive idle periods). Thus:

(4) $T_c = T_a + T_b.$

Two types of busy periods are observed: $T_{b1}$ and $T_{b2}$, where $T_{bj}$ (j=1, 2) starts with the arrival of a
type-j car to an empty bridge and terminates when the system is empty. Hence:

(5) $E(T_c) = E(T_a) + E(T_b) = \frac{1}{\lambda} + \frac{\lambda_1}{\lambda} E(T_{b1}) + \frac{\lambda_2}{\lambda} E(T_{b2}).$

Following the notation in [2], $T_{bj}$ may be subdivided into subcycles of successive flows in alternating
directions denoted by $T_{kj}$,

(6) $T_{bj} = \sum_{k=1}^{\infty} T_{kj}, \quad j = 1, 2.$

Every subcycle $T_{kj}$ (except when k=j=1) comprises two successive phases of flow and counterflow
as follows:

$T_{11} = T_{111}$

(7) $T_{k1} = T_{2,k-1,1} + T_{1k1}, \quad k > 1$

where $T_{ikj}$ is the k-th phase of flow of type-i cars in a type-j busy period.
Thus, the two types of busy periods may be described as follows:

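Type-1

Phases: $T_{111} \mid T_{211} \mid T_{121} \mid T_{221} \mid T_{131} \mid \cdots \mid T_{2,k-1,1} \mid T_{1k1} \mid \cdots$

Subcycles: $T_{11} \mid T_{21} \mid T_{31} \mid \cdots \mid T_{k1} \mid \cdots$

Busy period: $T_{b1}$

Type-2

Phases: $T_{212} \mid T_{112} \mid T_{222} \mid T_{122} \mid \cdots \mid T_{2k2} \mid T_{1k2} \mid \cdots$

Subcycles: $T_{12} \mid T_{22} \mid \cdots \mid T_{k2} \mid \cdots$

Busy period: $T_{b2}$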

Each phase $T_{ikj}$ may in turn be divided into two subphases:

(8) $T_{ikj} = T^{(1)}_{ikj} + T^{(2)}_{ikj}, \quad i, j = 1, 2; \; k = 1, 2, \ldots,$

the first of which, $T^{(1)}_{ikj}$, begins when the type-i car initiating the phase mounts the bridge and
terminates when there is no queue in direction i (there are type-i cars on the bridge); the second,
$T^{(2)}_{ikj}$, immediately follows the first and terminates when there are no more type-i cars in the
system (in queue and crossing the bridge).

In the first subphase the "service" consists in mounting the bridge, so that the service times for
type-i cars are independent r.v. distributed as $Y_i$ except for the initiator which has $Y_i^0$ as its service
time.

In the second subphase newly-arrived cars mount the bridge without waiting so long as there
are cars of the same type on the bridge. The "service" thus consists in crossing the bridge, so that
type-i cars have a constant service time denoted by $S_i$. Note that $S_i$ (i=1, 2) is determined by the
length of the bridge and the crossing speed limit, the latter being lower, by assumption, than the
free speed of type-i cars.

Let $W_i$ (i=1, 2) be the flow time of a type-i car (i.e., the time from arrival until departure),
and $W$ the flow time of an arbitrary car, then:

(9*) $L_W(z) = \frac{\lambda_1}{\lambda} L_{W_1}(z) + \frac{\lambda_2}{\lambda} L_{W_2}(z).$

Due to symmetry, $L_{W_2}(z)$ may be derived from $L_{W_1}(z)$ by changing indices, thus without loss of
generality, only type-1 cars need be considered.

A type-1 car arrives at either an empty or a busy bridge. In the first case its flow time is $S_1$; in the
second, let $U_{kj}$ be its flow time if it arrives in $T_{kj}$ (j=1, 2; k=1, 2, ...). Hence, following [1],
we have:

(10) $L_{W_1}(z) = \frac{1}{\lambda E(T_c)} \left\{ e^{-zS_1} + \lambda_1 \sum_{k=1}^{\infty} E(T_{k1}) L_{U_{k1}}(z) + \lambda_2 \sum_{k=1}^{\infty} E(T_{k2}) L_{U_{k2}}(z) \right\}.$

In the next section the L.S. transform and the expected value of a phase are found which yield (in
view of (7)) $E(T_{k1})$ and $E(T_{k2})$. Following in 2.3 $E(T_c)$ is computed and finally in 2.4 we turn to
find $L_{U_{kj}}(z)$. Combining these results, using (10), yields the L.S. transform of the flow time of a
type-1 car.



*For a non-negative r.v. X having c.d.f. $F_X(\cdot)$,

$L_X(z) = \int_0^{\infty} e^{-zx} \, dF_X(x)$
denotes its Laplace-Stieltjes (L.S.) transform.

2.2. Length of Phase 

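Consider $T_{1k1}$, $k \geq 1$. Clearly $T^{(2)}_{1k1} = 0$ whenever $T^{(1)}_{1k1} = 0$ so that for $k > 1$ we have:

(11) $T_{1k1} = \begin{cases} 0, & T^{(1)}_{1k1} = 0 \\ T^{(1)}_{1k1} + T^{(2)}_{1k1}, & T^{(1)}_{1k1} > 0 \end{cases}$

where $T^{(1)}_{1k1}$ and $T^{(2)}_{1k1}$ are defined and discussed by equation (8) and its sequel.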

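If the first component, $T^{(1)}_{1k1}$, is positive it can be treated as a busy period in an M/G/1 model
where the service time is $Y_1$, and the service time of the "first customer" is the time required by
the initial queue of type-1 cars, formed during $T_{2,k-1,1}$, to mount the bridge. Therefore,

(12) $T^{(1)}_{1k1} = \sum_{i=1}^{\infty} t_i,$

where $t_1$ is the mounting time required by the initial queue and $t_i$ ($i > 1$) is the time required by the
cars arrived during $t_{i-1}$ to mount the bridge.
By well-known results ([2, p. 151]):

(13) $L_{T^{(1)}_{1k1}}(z) = L_{t_1}(z + \lambda_1 - \lambda_1 L_1(z)), \quad k > 1$

where

(14) $L_1(z) = L_{Y_1}(z + \lambda_1 - \lambda_1 L_1(z)).$

Let $N_{2,k-1,1}$ be the number of type-1 cars arrived during $T_{2,k-1,1}$, then:

(15) $E(e^{-z t_1} \mid T_{2,k-1,1} = t, N_{2,k-1,1} = n) = \begin{cases} L_{Y_1^0}(z) [L_{Y_1}(z)]^{n-1}, & n > 0 \\ 1, & n = 0. \end{cases}$

Withdrawing the conditions in (15) we have:

(16) $L_{t_1}(z) = L_{T_{2,k-1,1}}(\lambda_1) + \frac{L_{Y_1^0}(z)}{L_{Y_1}(z)} \left( L_{T_{2,k-1,1}}(\lambda_1 - \lambda_1 L_{Y_1}(z)) - L_{T_{2,k-1,1}}(\lambda_1) \right), \quad k > 1.$
Substituting (16) in (13) we obtain:

(17) $L_{T^{(1)}_{1k1}}(z) = L_{T_{2,k-1,1}}(\lambda_1) + \frac{L_{Y_1^0}(z + \lambda_1 - \lambda_1 L_1(z))}{L_1(z)} \left( L_{T_{2,k-1,1}}(\lambda_1 - \lambda_1 L_1(z)) - L_{T_{2,k-1,1}}(\lambda_1) \right), \quad k > 1$

where $L_1(z)$ is given by (14).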
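The functional equation (14) rarely has a closed-form solution, but for a given mounting-time transform it can be evaluated numerically. The following is a minimal sketch, not part of the original paper: it solves (14) by successive substitution starting from zero, and checks the result against the classical M/M/1 busy-period transform under the illustrative assumption that the mounting time is exponential. The function names and the exponential assumption are ours.

```python
import math


def busy_period_lst(z, lam, service_lst, tol=1e-12, max_iter=10_000):
    """Solve L(z) = service_lst(z + lam - lam * L(z)) by successive substitution.

    lam is the Poisson arrival rate and service_lst is the L.S. transform of
    the mounting (service) time; both are supplied by the caller.
    """
    value = 0.0  # starting from 0 the iterates increase to the smallest root
    for _ in range(max_iter):
        new_value = service_lst(z + lam - lam * value)
        if abs(new_value - value) < tol:
            return new_value
        value = new_value
    raise RuntimeError("successive substitution did not converge")


def exponential_lst(mu):
    """L.S. transform of an exponential r.v. with rate mu (illustrative choice)."""
    return lambda s: mu / (mu + s)


def mm1_busy_period_lst(z, lam, mu):
    """Closed-form busy-period transform for the M/M/1 queue, used as a check."""
    b = lam + mu + z
    return (b - math.sqrt(b * b - 4.0 * lam * mu)) / (2.0 * lam)
```

For example, `busy_period_lst(1.0, 5.0, exponential_lst(20.0))` should agree with `mm1_busy_period_lst(1.0, 5.0, 20.0)` to within the tolerance. Any other transform, such as that of a constant mounting time, can be passed in the same way.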

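As for the second subphase in $T_{1k1}$: let $T^{(1)}_{1k1} > 0$ and let $M^{(2)}_{1k1}$ be the number of type-1 cars arrived during $T^{(2)}_{1k1}$. Denote by $\tau_m$, $m \geq 1$, the time
elapsed between the moments of arrival of the (m-1)-th and the m-th car in $T^{(2)}_{1k1}$. The $\tau_m$ are independent
r.v. distributed as $\tau_1$, and $\tau_1 < S_1$ because otherwise $T^{(2)}_{1k1}$ would terminate. Hence the density function
of $\tau_1$ takes the form:

(18) $f_{\tau_1}(t) = \frac{\lambda_1 e^{-\lambda_1 t}}{1 - e^{-\lambda_1 S_1}}, \quad 0 < t < S_1.$

A type-1 car arriving at the system during $T^{(2)}_{1k1}$ could cross the bridge iff the preceding
type-1 car were still on it, an event which occurs with probability $1 - e^{-\lambda_1 S_1}$, hence:

(19) $P(M^{(2)}_{1k1} = n) = (1 - e^{-\lambda_1 S_1})^n e^{-\lambda_1 S_1}, \quad n \geq 0; \; k > 1.$

Now for $k > 1$,

(20) $T^{(2)}_{1k1} = \begin{cases} \sum_{m=1}^{M^{(2)}_{1k1}} \tau_m + S_1, & M^{(2)}_{1k1} > 0 \\ S_1, & M^{(2)}_{1k1} = 0. \end{cases}$

Expressing (20) in terms of L.S. transform we arrive at:

(21) $L_{T^{(2)}_{1k1}}(z) = \frac{\lambda_1 + z}{\lambda_1 + z e^{(\lambda_1 + z) S_1}}, \quad k > 1.$

When $k = 1$ it is true that:

(22) $T_{111} = T^{(2)}_{111} = S_1 + \sum_{m=1}^{M^{(2)}_{111} - 1} \tau_m,$

but

(23) $P(M^{(2)}_{111} = n) = (1 - e^{-\lambda_1 S_1})^{n-1} e^{-\lambda_1 S_1}, \quad n = 1, 2, \ldots.$

(It is clear from the definition that at least one type-1 car appears during $T_{111}$.) Therefore:

(24) $L_{T_{111}}(z) = \frac{\lambda_1 + z}{\lambda_1 + z e^{(\lambda_1 + z) S_1}}.$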

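When $k > 1$, $T_{1k1} = 0$ iff in the preceding phase, $T_{2,k-1,1}$, there were no arrivals of type-1 cars. The
probability of this event is:

(25) $P(T_{1k1} = 0) = \int_0^{\infty} e^{-\lambda_1 t} \, dF_{T_{2,k-1,1}}(t) = L_{T_{2,k-1,1}}(\lambda_1), \quad k > 1.$

By (11), (25) and the independence of $T^{(1)}_{1k1} \mid T^{(1)}_{1k1} > 0$ and $T^{(2)}_{1k1}$:

(26) $E(e^{-z T_{1k1}}) = L_{T_{2,k-1,1}}(\lambda_1) + (1 - L_{T_{2,k-1,1}}(\lambda_1)) L_{T^{(2)}_{1k1}}(z) E(e^{-z T^{(1)}_{1k1}} \mid T^{(1)}_{1k1} > 0), \quad k > 1.$

But

(27) $E(e^{-z T^{(1)}_{1k1}} \mid T^{(1)}_{1k1} > 0) = \int_{0^+}^{\infty} e^{-zt} \, dF_{T^{(1)}_{1k1} \mid T^{(1)}_{1k1} > 0}(t) = \int_{0^+}^{\infty} e^{-zt} \, d\left[ \frac{F_{T^{(1)}_{1k1}}(t)}{1 - P(T_{1k1} = 0)} \right].$

Substituting (27) in (26) yields:

(28) $L_{T_{1k1}}(z) = L_{T_{2,k-1,1}}(\lambda_1) + L_{T^{(2)}_{1k1}}(z) \left( L_{T^{(1)}_{1k1}}(z) - L_{T_{2,k-1,1}}(\lambda_1) \right), \quad k > 1.$

From (17), (21) and (28) we finally obtain:

(29) $L_{T_{1k1}}(z) = L_{T_{2,k-1,1}}(\lambda_1) + \frac{\lambda_1 + z}{\lambda_1 + z e^{(\lambda_1 + z) S_1}} \cdot \frac{L_{Y_1^0}(z + \lambda_1 - \lambda_1 L_1(z))}{L_1(z)} \left( L_{T_{2,k-1,1}}(\lambda_1 - \lambda_1 L_1(z)) - L_{T_{2,k-1,1}}(\lambda_1) \right), \quad k > 1,$

where $L_1(z) = L_{Y_1}(z + \lambda_1 - \lambda_1 L_1(z))$.

Similarly, by symmetry, we have:

(30) $L_{T_{2k1}}(z) = L_{T_{1k1}}(\lambda_2) + \frac{\lambda_2 + z}{\lambda_2 + z e^{(\lambda_2 + z) S_2}} \cdot \frac{L_{Y_2^0}(z + \lambda_2 - \lambda_2 L_2(z))}{L_2(z)} \left( L_{T_{1k1}}(\lambda_2 - \lambda_2 L_2(z)) - L_{T_{1k1}}(\lambda_2) \right), \quad k \geq 1,$

where

$L_2(z) = L_{Y_2}(z + \lambda_2 - \lambda_2 L_2(z)).$

Theoretically, $L_{T_{ik1}}(z)$ is obtainable recursively from the above relations: $L_{T_{111}}(z)$ is given by
(24); substituting $L_{T_{111}}(z)$ in (30) for k=1 yields $L_{T_{211}}(z)$; substituting $L_{T_{211}}(z)$ in (29) for k=2
yields $L_{T_{121}}(z)$, etc. $L_{T_{ik2}}(z)$ are obtained by change of indices in $L_{T_{ik1}}(z)$.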

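Differentiating (24), (29) and (30) with respect to z at z=0 yields the expected value of the
length of a phase:

(31) $E(T_{111}) = \frac{e^{\lambda_1 S_1} - 1}{\lambda_1},$

(32) $E(T_{1k1}) = \frac{\rho_1}{1 - \rho_1} E(T_{2,k-1,1}) + \left( \frac{E(Y_1^0 - Y_1)}{1 - \rho_1} + \frac{e^{\lambda_1 S_1} - 1}{\lambda_1} \right) (1 - L_{T_{2,k-1,1}}(\lambda_1)), \quad k > 1,$

(33) $E(T_{2k1}) = \frac{\rho_2}{1 - \rho_2} E(T_{1k1}) + \left( \frac{E(Y_2^0 - Y_2)}{1 - \rho_2} + \frac{e^{\lambda_2 S_2} - 1}{\lambda_2} \right) (1 - L_{T_{1k1}}(\lambda_2)), \quad k \geq 1.$

Solving the set (31), (32), and (33), we have:

(34) $E(T_{1k1}) = E(T_{111}) r^{k-1} + \left( \frac{E(Y_1^0 - Y_1)}{1 - \rho_1} + \frac{e^{\lambda_1 S_1} - 1}{\lambda_1} \right) \sum_{m=1}^{k-1} r^{k-1-m} (1 - L_{T_{2m1}}(\lambda_1)) + \frac{\rho_1}{1 - \rho_1} \left( \frac{E(Y_2^0 - Y_2)}{1 - \rho_2} + \frac{e^{\lambda_2 S_2} - 1}{\lambda_2} \right) \sum_{m=1}^{k-1} r^{k-1-m} (1 - L_{T_{1m1}}(\lambda_2))$

and

(35) $E(T_{2k1}) = \frac{\rho_2}{1 - \rho_2} E(T_{111}) r^{k-1} + \frac{\rho_2}{1 - \rho_2} \left( \frac{E(Y_1^0 - Y_1)}{1 - \rho_1} + \frac{e^{\lambda_1 S_1} - 1}{\lambda_1} \right) \sum_{m=1}^{k-1} r^{k-1-m} (1 - L_{T_{2m1}}(\lambda_1)) + \left( \frac{E(Y_2^0 - Y_2)}{1 - \rho_2} + \frac{e^{\lambda_2 S_2} - 1}{\lambda_2} \right) \sum_{m=1}^{k} r^{k-m} (1 - L_{T_{1m1}}(\lambda_2))$

where

$r = \frac{\rho_1 \rho_2}{(1 - \rho_1)(1 - \rho_2)}$
and $E(T_{111})$ is given by (31).

Substituting (31), (34) and (35) in (7) yields the expected length of the subcycles $E(T_{k1})$,
$k \geq 1$. Thus, to find the Laplace-Stieltjes transform of the flow time of a type-i car, it is left to
find $E(T_c)$ and $L_{U_{kj}}(z)$, j=1, 2; k=1, 2, .... In the next section we calculate $E(T_c)$.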

2.3. Expected Length of Cycle 

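Let $N_{ij}$ (i, j=1, 2) be the number of type-i phases in a type-j busy period. The distributions
and expectations of these r.v. were found in [2, p. 199]:

(36) $P(N_{11} = 1) = P(T_{121} = 0) = \int_0^{\infty} e^{-\lambda_1 t} \, dF_{T_{211}}(t) = L_{T_{211}}(\lambda_1),$

and

(37) $P(N_{11} = k) = P(T_{1k1} > 0, T_{1,k+1,1} = 0) = P(T_{1,k+1,1} = 0) - P(T_{1k1} = 0) = L_{T_{2k1}}(\lambda_1) - L_{T_{2,k-1,1}}(\lambda_1), \quad k > 1.$
Hence:

(38) $E(N_{11}) = 1 + \sum_{k=1}^{\infty} (1 - L_{T_{2k1}}(\lambda_1))$

and similarly:

(39) $E(N_{21}) = \sum_{k=1}^{\infty} (1 - L_{T_{1k1}}(\lambda_2)).$

The expected values of $N_{12}$ and $N_{22}$ are obtained from (38) and (39) by change of indices.
From (31), (32), (33), (38) and (39), we have:

(40) $\sum_{k=1}^{\infty} E(T_{1k1}) = \frac{(1 - \rho_1)(1 - \rho_2)}{1 - \rho} \left[ \frac{e^{\lambda_1 S_1} - 1}{\lambda_1} + \left( \frac{E(Y_1^0 - Y_1)}{1 - \rho_1} + \frac{e^{\lambda_1 S_1} - 1}{\lambda_1} \right) E(N_{11} - 1) + \frac{\rho_1}{1 - \rho_1} \left( \frac{E(Y_2^0 - Y_2)}{1 - \rho_2} + \frac{e^{\lambda_2 S_2} - 1}{\lambda_2} \right) E(N_{21}) \right]$

(41) $\sum_{k=1}^{\infty} E(T_{2k1}) = \frac{(1 - \rho_1)(1 - \rho_2)}{1 - \rho} \left[ \frac{\rho_2}{1 - \rho_2} \left( \frac{e^{\lambda_1 S_1} - 1}{\lambda_1} + \left( \frac{E(Y_1^0 - Y_1)}{1 - \rho_1} + \frac{e^{\lambda_1 S_1} - 1}{\lambda_1} \right) E(N_{11} - 1) \right) + \left( \frac{E(Y_2^0 - Y_2)}{1 - \rho_2} + \frac{e^{\lambda_2 S_2} - 1}{\lambda_2} \right) E(N_{21}) \right]$

Hence:

(42) $E(T_{b1}) = \sum_{k=1}^{\infty} [E(T_{1k1}) + E(T_{2k1})] = \frac{1}{1 - \rho} \left\{ (1 - \rho_1) \left( \frac{E(Y_1^0 - Y_1)}{1 - \rho_1} + \frac{e^{\lambda_1 S_1} - 1}{\lambda_1} \right) E(N_{11}) + E(Y_1 - Y_1^0) + (1 - \rho_2) \left( \frac{E(Y_2^0 - Y_2)}{1 - \rho_2} + \frac{e^{\lambda_2 S_2} - 1}{\lambda_2} \right) E(N_{21}) \right\}.$

$E(T_{b2})$ is obtained from (42) by change of indices.
Finally from (42) and (5):

(43) $E(T_c) = \frac{1}{\lambda} + \frac{1}{\lambda (1 - \rho)} \left\{ \lambda_1 (1 - \rho_1) \left( \frac{E(Y_1^0 - Y_1)}{1 - \rho_1} + \frac{e^{\lambda_1 S_1} - 1}{\lambda_1} \right) E(N_{11}) + \lambda_1 E(Y_1 - Y_1^0) + \lambda_2 (1 - \rho_1) \left( \frac{E(Y_1^0 - Y_1)}{1 - \rho_1} + \frac{e^{\lambda_1 S_1} - 1}{\lambda_1} \right) E(N_{12}) + \lambda_1 (1 - \rho_2) \left( \frac{E(Y_2^0 - Y_2)}{1 - \rho_2} + \frac{e^{\lambda_2 S_2} - 1}{\lambda_2} \right) E(N_{21}) + \lambda_2 (1 - \rho_2) \left( \frac{E(Y_2^0 - Y_2)}{1 - \rho_2} + \frac{e^{\lambda_2 S_2} - 1}{\lambda_2} \right) E(N_{22}) + \lambda_2 E(Y_2 - Y_2^0) \right\}$

where $E(N_{ij})$, i, j=1, 2, are given by equations (38) and (39).

Substituting $E(T_c)$ and the previous results in $L_{W_1}(z)$ (equation (10)), it still remains to find the
L.S. transform of the flow time of a type-1 car in a subcycle. This is done in the next section.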

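2.4. Flow Time

Consider $T_{k1}$, $k > 1$.
The flow time of a type-1 car in $T^{(2)}_{1k1}$ is $S_1$. Let $U'_{k1}$, $k > 1$, be the flow time of a type-1 car in

(44) $T = T_{2,k-1,1} + T^{(1)}_{1k1}.$

$T$ may be treated as a delayed busy period in an M/G/1 model where $T_{2,k-1,1}$ is the delay interval
and $T^{(1)}_{1k1}$ is treated as in (12). Using known results for the flow time in an M/G/1 model where
the busy period is delayed and the first customer has a different service time ([2, p. 153]):

(45) $L_{U'_{k1}}(z) = \frac{e^{-zS_1} \left[ L_{Y_1}(z) (1 - L_{T_{2,k-1,1}}(\lambda_1)) - L_{Y_1^0}(z) (L_{T_{2,k-1,1}}(z) - L_{T_{2,k-1,1}}(\lambda_1)) \right]}{E(T_{2,k-1,1} + T^{(1)}_{1k1}) (\lambda_1 L_{Y_1}(z) - \lambda_1 + z)}, \quad k > 1.$

By (11) and (25) we have:

(46) $L_{U_{k1}}(z) = \frac{E(T_{2,k-1,1} + T^{(1)}_{1k1})}{E(T_{k1})} L_{U'_{k1}}(z) + (1 - L_{T_{2,k-1,1}}(\lambda_1)) \frac{e^{\lambda_1 S_1} - 1}{\lambda_1 E(T_{k1})} e^{-zS_1}, \quad k > 1.$

Substituting (45) in (46) yields:

(47) $L_{U_{k1}}(z) = \frac{e^{-zS_1}}{E(T_{k1})} \left\{ \frac{L_{Y_1}(z) (1 - L_{T_{2,k-1,1}}(\lambda_1)) - L_{Y_1^0}(z) (L_{T_{2,k-1,1}}(z) - L_{T_{2,k-1,1}}(\lambda_1))}{\lambda_1 L_{Y_1}(z) - \lambda_1 + z} + \frac{e^{\lambda_1 S_1} - 1}{\lambda_1} (1 - L_{T_{2,k-1,1}}(\lambda_1)) \right\}, \quad k > 1$
and similarly:

(48) $L_{U_{k2}}(z) = \frac{e^{-zS_1}}{E(T_{k2})} \left\{ \frac{L_{Y_1}(z) (1 - L_{T_{2k2}}(\lambda_1)) - L_{Y_1^0}(z) (L_{T_{2k2}}(z) - L_{T_{2k2}}(\lambda_1))}{\lambda_1 L_{Y_1}(z) - \lambda_1 + z} + \frac{e^{\lambda_1 S_1} - 1}{\lambda_1} (1 - L_{T_{2k2}}(\lambda_1)) \right\}, \quad k \geq 1.$

By definition:

(49) $L_{U_{11}}(z) = e^{-zS_1}.$

Substituting (47), (48) and (49) in (10), we finally obtain the Laplace transform of the flow time
for a type-1 car:

(50) $L_{W_1}(z) = \frac{e^{-zS_1}}{\lambda E(T_c)} \left\{ 1 + \frac{e^{\lambda_1 S_1} - 1}{\lambda_1} (\lambda_1 E(N_{11}) + \lambda_2 E(N_{12})) + \frac{1}{\lambda_1 L_{Y_1}(z) - \lambda_1 + z} \sum_{k=1}^{\infty} \left[ L_{Y_1}(z) \left( \lambda_1 (1 - L_{T_{2k1}}(\lambda_1)) + \lambda_2 (1 - L_{T_{2k2}}(\lambda_1)) \right) - L_{Y_1^0}(z) \left( \lambda_1 (L_{T_{2k1}}(z) - L_{T_{2k1}}(\lambda_1)) + \lambda_2 (L_{T_{2k2}}(z) - L_{T_{2k2}}(\lambda_1)) \right) \right] \right\}.$

Differentiating (50) with respect to z at z=0 we obtain the expected flow time of a type-1 car:

(51) $E(W_1) = S_1 + \frac{1}{2 \lambda E(T_c) (1 - \rho_1)^2} \left\{ (1 - \rho_1) \sum_{k=1}^{\infty} [\lambda_1 E(T_{2k1}^2) + \lambda_2 E(T_{2k2}^2)] + (\lambda_1 E(Y_1^2) + 2 (1 - \rho_1) E(Y_1^0)) \sum_{k=1}^{\infty} [\lambda_1 E(T_{2k1}) + \lambda_2 E(T_{2k2})] + (\lambda_1 E(Y_1^2) E(Y_1^0 - Y_1) + (1 - \rho_1) E((Y_1^0)^2 - Y_1^2)) (\lambda_1 E(N_{11} - 1) + \lambda_2 E(N_{12})) \right\}.$

Now $\sum_{k=1}^{\infty} E(T_{2k1})$ is given by (41); $\sum_{k=1}^{\infty} E(T_{2k2})$ is obtained from (40) by change of indices;
differentiating (30) twice with respect to z at z=0, we
have $E(T_{2k1}^2)$; differentiating (29) twice with respect to z at z=0 and changing indices, we have
$E(T_{2k2}^2)$. These results are to be substituted in (51) to yield the expected value of the flow time
of a type-1 car.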

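3. SPECIAL CASES

3.1. $F_{Y_i^0}(y) = F_{Y_i}(y)$, i=1, 2.

We assumed that the first type-i car (i=1, 2) in each subcycle has a different mounting time.
This is usually the case in real life. However, assuming that the mounting times of all type-i cars
are identically distributed simplifies the equations and may provide adequate approximations.

Hence replacing $Y_i^0$ by $Y_i$ (i=1, 2) in (40) and (41) and changing indices in (40), we have:

(52) $\sum_{k=1}^{\infty} E(T_{2k1}) = \frac{(1 - \rho_1)(1 - \rho_2)}{1 - \rho} \left[ \frac{\rho_2}{1 - \rho_2} \cdot \frac{e^{\lambda_1 S_1} - 1}{\lambda_1} E(N_{11}) + \frac{e^{\lambda_2 S_2} - 1}{\lambda_2} E(N_{21}) \right]$

(53) $\sum_{k=1}^{\infty} E(T_{2k2}) = \frac{(1 - \rho_1)(1 - \rho_2)}{1 - \rho} \left[ \frac{e^{\lambda_2 S_2} - 1}{\lambda_2} E(N_{22}) + \frac{\rho_2}{1 - \rho_2} \cdot \frac{e^{\lambda_1 S_1} - 1}{\lambda_1} E(N_{12}) \right]$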

Differentiating (24), (29) and (30) twice with respect to z at z=0 we obtain difference equations for
the second moments of the phase length.

Summing these equations we obtain two independent equations for $\sum_{k=1}^{\infty} E(T_{1k1}^2)$ and
$\sum_{k=1}^{\infty} E(T_{2k1}^2)$ whose solution is:

(54) |:^(r..)^^,{(^j^^.^)S.(n.,)+ ---n7"^''-' ^''^'--'l 

(55) 

^-rb(iJ-JI(^,'gg^a-.^)S^<^->^ ""^''":7'"^'^-' ^'^-">l- 



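Replacing $Y_i^0$ by $Y_i$ (i=1, 2) in (51), we obtain:

(56) $E(W_1) = S_1 + \frac{1}{2 \lambda E(T_c) (1 - \rho_1)^2} \left\{ (1 - \rho_1) \sum_{k=1}^{\infty} [\lambda_1 E(T_{2k1}^2) + \lambda_2 E(T_{2k2}^2)] + (\lambda_1 E(Y_1^2) + 2 (1 - \rho_1) E(Y_1)) \sum_{k=1}^{\infty} [\lambda_1 E(T_{2k1}) + \lambda_2 E(T_{2k2})] \right\}.$

Now $\sum_{k=1}^{\infty} E(T_{2k1}^2)$ is given by (55); $\sum_{k=1}^{\infty} E(T_{2k2}^2)$ is obtained from (54) by change of indices;
$\sum_{k=1}^{\infty} E(T_{2k1})$ and $\sum_{k=1}^{\infty} E(T_{2k2})$
are given in (52) and (53). Substituting these values in (56), we have the expected flow time of a
type-1 car in terms of the known parameters of the system. $E(W_2)$ is obtained from (56) by change
of indices, and $E(W)$ is obtained from (9).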

3.2. Alternating Priorities 

If the crossing times are negligible then the problem of simultaneous service disappears and 
the only service that cars receive at the bridge is mounting it. Hence substituting <S'i=0, i=l, 2, 
in the above discussion we obtain the known formulas for the L.S. transform and the expected 
value of the flow time under alternating-priorities rule ([1, 2]). Furthermore, the case of alternating- 
priorities with set-up times may also be derived if the first service in each phase, Y°i, i=l, 2, is 
decomposed into a sum of the (independent) set-up time and the "ordinary" service time. 

4. DISCUSSION 

4.1. Non-saturation Conditions

The distribution of $T^{(2)}_{1k1}$, $k > 1$, is independent of k (equation (21)). In the same manner it can
be proved that $T^{(2)}_{ikj}$, $k > 1$, is independent of k and j.
Define:

(57) $\tilde{Y}_1^0 = Y_1^0 + T_1^{(2)}, \quad \tilde{Y}_2^0 = Y_2^0 + T_2^{(2)}$

With this definition, ignoring the first and last phases in each busy period, our model becomes
identical, as far as saturation is concerned, to an alternating priorities model in which the service
time for the i-th priority class (i=1, 2) is distributed as $Y_i$, except for the first customer in a phase
whose service time is distributed as $\tilde{Y}_i^0$. (Obviously, the first and last phases in each busy period
and the different service time of the first customer do not affect saturation.) Hence the non-saturation
condition for our model is the same as that for the above described alternating priorities model
([1, 2, Chapter 9]), namely the condition stated in (3).

4.2. Mounting Times 

When $T_{ikj}$, $k > 1$, starts, the queue of type-i cars formed at the foot of the bridge during the
previous phase mounts the bridge. The mounting time of a car was defined as the time elapsed
between the moments when two consecutive cars present in the system begin to cross the bridge.
Clearly, the mounting time is the time needed to pass a distance equal to the length of a car and a
minimal safety distance between two consecutive cars. (In the subphases $T^{(2)}_{ikj}$, j=1, 2, $k \geq 1$, the
minimal safety distance is not necessarily kept.) The mounting time of the first car in a phase does
not share the same distribution because the first car has some additional preparations to make
before mounting the bridge and here no overlapping activities are possible.

4.3. Graphical Representation 

It is assumed, for simplicity, that the two priority classes differ only in their arrival rates, i.e.,
they have the same crossing time S, and their mounting time is distributed as Y including the first
car in a phase.

Furthermore let us denote:
A = bridge's length
$\phi$ = crossing velocity (constant)
c = car's length (assumed to be uniformly distributed on the interval (3, 5))
D = safety distance
$\psi$ = mounting velocity (constant)
Distances are measured in meters and the unit of time is a minute. The crossing velocity was assumed
to be constant; as a direct consequence the crossing time $S = A/\phi$ is constant too. Assume
further that $\psi$ is constant, then in view of 4.2, Y must be proportional to

$\frac{c + D}{\psi}$

and in fact, equality was assumed. Figure 1 shows the behaviour of the expected flow times $E(W_1)$
and $E(W_2)$ as a function of the bridge's length A, with $\lambda_1$ = 5 cars/min., $\lambda_2$ = 10 cars/min., D = 3
meters, $\psi$ = 150 meters/min., $\phi$ = 500 meters/min., E(Y) = 0.047 min., $\rho$ = 0.7.

Since $\lambda_1 < \lambda_2$, it takes more time to clear the bridge of type-2 cars, thus accounting for the greater
expected flow time of type-1 cars.
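The parameter values quoted for Figure 1 can be cross-checked with a few lines of arithmetic. The sketch below is ours, not the authors'; the function names are hypothetical, and it assumes, as stated above, that Y equals (c + D) divided by the mounting velocity with c uniform on (3, 5).

```python
def mean_mounting_time(car_min, car_max, safety_distance, mounting_velocity):
    """E(Y) when Y = (c + D) / psi and c is uniform on (car_min, car_max)."""
    mean_car_length = (car_min + car_max) / 2.0
    return (mean_car_length + safety_distance) / mounting_velocity


def traffic_intensity(lam1, lam2, mean_y):
    """rho = rho_1 + rho_2 when both classes share the same mounting time."""
    return (lam1 + lam2) * mean_y


def crossing_time(bridge_length, crossing_velocity):
    """Constant crossing time S = A / phi."""
    return bridge_length / crossing_velocity
```

With the values of Section 4.3, `mean_mounting_time(3, 5, 3, 150)` is 7/150, about 0.047 min, and `traffic_intensity(5, 10, 7/150)` is 0.7, so the non-saturation condition (3) holds; a bridge of 100 meters gives `crossing_time(100, 500)` of 0.2 min.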



Figure 1. Expected flow time as a function of the bridge's length (horizontal axis: bridge's length in meters).



REFERENCES 

[1] Avi-Itzhak, B., W. L. Maxwell and L. W. Miller, "Queueing with Alternating Priorities," Operations Research, 13, 306-318 (1965).
[2] Conway, R. W., W. L. Maxwell and L. W. Miller, Theory of Scheduling, (Addison-Wesley, 1967).
[3] Darroch, J. N., G. F. Newell and R. W. J. Morris, "Queueing for a Vehicle-Actuated Traffic Light," Operations Research, 12, 882-895 (1964).
[4] Hawkes, A. G., "Queueing at Traffic Intersections," Proceedings, Second Symposium on the Theory of Traffic Flow, London (1963).
[5] Tanner, J. C., "A Problem of Interference Between Two Queues," Biometrika, 40, 58-69 (1953).



OPTIMAL CONTROL FOR MULTI-SERVERS QUEUEING SYSTEMS 

UNDER PERIODIC REVIEW* 



C. C. Huang,† S. L. Brumelle and K. Sawaki

University of British Columbia 
Vancouver, B.C., Canada 

I. Vertinsky 

International Institute of Management 
Berlin, Germany 



ABSTRACT 

This paper deals with the problem of finding the optimal dynamic operating
policy for an M/M/S queue. The system is observed periodically, and at the
beginning of each period the system controller selects the number of service units to
be kept open during that period. The optimality criterion used is the total
discounted cost over a finite horizon.






INTRODUCTION 

Most of the related studies reported recently in the literature focused upon controls of one- 
server queues over infinite horizon (Heyman [3], Bell [2], Sobel [5], Balachandran [1]). 

Zacks and Yadin [6] dealt with the case of finite horizon for an M/M/1 system with variable service
intensity under nonperiodic review. In their model decision epochs occur immediately after changes
in queue size. They identify an "optimal" policy resorting to a conjecture and imposing restrictive
assumptions. Magazine [4] studied M/M/S queues with finite waiting capacity under periodic
review and convex increasing holding costs over finite horizons.

All the results in Magazine [4] are derived on the basis of (1) a misspecified holding cost function
which ignores the fact that customers may be turned away when waiting room capacity is saturated,
and (2) an incorrect argument that the distribution of the number served by the i-th server in an m-server
system is identical to the distribution in an n-server system where $n \neq m$.

This paper considers a similar model to the one outlined by Magazine with two major modi- 
fications : 

(1) Waiting room capacity is taken to be infinite. 

(2) The cost structure is generalized to permit different holding costs in different periods. 



*The research reported in this paper was supported in part by the Defence Research Board of Canada and
the International Institute of Management, Berlin.

†Currently affiliated with Memorial University of Newfoundland, St. John's, Newfoundland, Canada.

MODEL FORMULATION 
System Structure 

(i) Assume there are s servers in parallel. 

(ii) Assume a Poisson arrival stream. 

(iii) Assume that the decision points are at equally spaced time intervals. Without loss of 
generality, we assume that these points are 0, 1, 2, .... 

(iv) Assume that the service times have independent and identical exponential distributions 
and are independent of the arrival process. 

Cost Structure 

Let $S(l|k)$ denote the cost of changing from k to l open servers at a decision epoch, and then
operating the l servers until the next decision epoch. We assume that $S(l|k)$ has the following
properties:

(i) $S(l|k)$ is convex in l for each k;

(ii) For fixed j and l with $l < j$, the function $S(j|k) - S(l|k)$ is nonincreasing in k, and equal to
$S(j|k-1) - S(l|k-1)$ for $k \leq l$ or $k - 1 \geq j$.

For example, we might take

$S(l|k) = (l-k)^+ A + (k-l)^+ B + lC$

where A is the cost of opening a closed server, B is the cost of closing an open server, C is the cost of
operating an open server for one period, and

$k^+ = \max(k, 0).$

It is easy to check that $S(l|k)$ of this form satisfies (i) and (ii).
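The check of properties (i) and (ii) for the example cost function can also be carried out by brute force over a finite range of servers. The sketch below is illustrative and not taken from the paper; the function names and the particular values of A, B, C and s used in the usage note are our assumptions.

```python
def switching_cost(l, k, A, B, C):
    """Example cost S(l|k) = (l - k)^+ A + (k - l)^+ B + l C."""
    return max(l - k, 0) * A + max(k - l, 0) * B + l * C


def is_convex_in_l(k, s, A, B, C):
    """Property (i): second differences in l are non-negative for fixed k."""
    values = [switching_cost(l, k, A, B, C) for l in range(s + 1)]
    return all(values[l + 1] - 2 * values[l] + values[l - 1] >= 0
               for l in range(1, s))


def satisfies_property_ii(s, A, B, C):
    """Property (ii): S(j|k) - S(l|k) is nonincreasing in k for l < j, and is
    unchanged from k - 1 to k when k <= l or k - 1 >= j."""
    for l in range(s + 1):
        for j in range(l + 1, s + 1):
            for k in range(1, s + 1):
                now = switching_cost(j, k, A, B, C) - switching_cost(l, k, A, B, C)
                before = (switching_cost(j, k - 1, A, B, C)
                          - switching_cost(l, k - 1, A, B, C))
                if now > before:
                    return False
                if (k <= l or k - 1 >= j) and now != before:
                    return False
    return True
```

For instance, with A = 3, B = 2, C = 1 and s = 5 both checks succeed. Integer costs are used so that the equality test in property (ii) is exact.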

In addition to the above switching and operating costs, we include a customer holding cost. 
Let Kn(i) denote the cost incurred during period n if it ends with i customers in the system. 

Markov Decision Structure 

Define $\phi_n(i, \alpha, k)$ as the minimum expected cost with i customers present, k servers open and n
periods remaining in the horizon, using discount factor $\alpha$, $0 < \alpha < 1$. We take the length of time
between decision epochs to be the unit of time. The recursive relationships are

$\phi_n(i, \alpha, k) = \min_{0 \leq l \leq s} \psi_n(l|i, \alpha, k), \quad n = 1, 2, \ldots, N$

where

$\psi_n(l|i, \alpha, k) = S(l|k) + \sum_{j=0}^{\infty} P_{ij}^l [K_n(j) + \alpha \phi_{n-1}(j, \alpha, l)], \quad \phi_0(i, \alpha, k) = 0,$

$Z_n$ is the number of customers in the system at the beginning of period n, and

$P_{ij}^l = P[Z_n = j \mid Z_{n+1} = i \cap l \text{ servers are open}].$
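The recursion above can be evaluated by backward induction once the one-period transition probabilities are available. The following sketch is ours and makes two simplifying assumptions that are not in the paper: the queue length is truncated to a finite set of states, and the matrices P[l][i][j] are supplied by the caller rather than computed from the M/M/S dynamics. All names are hypothetical.

```python
def backward_induction(P, S, K, alpha, horizon):
    """Evaluate phi_n(i, alpha, k) for n = 1, ..., horizon.

    P[l][i][j] : assumed one-period transition probabilities with l servers open
                 on a truncated state space (supplied by the caller)
    S(l, k)    : switching and operating cost S(l|k)
    K(n, j)    : holding cost of period n when it ends with j customers
    Returns (phi, policies), where phi[i][k] is the value with `horizon` periods
    remaining and policies[n - 1][i][k] is a minimizing l with n periods remaining.
    """
    num_actions = len(P)        # l = 0, ..., s
    num_states = len(P[0])      # truncated queue lengths
    phi = [[0.0] * num_actions for _ in range(num_states)]  # phi_0 = 0
    policies = []
    for n in range(1, horizon + 1):
        # expected holding cost plus discounted future cost after choosing l
        cont = [[sum(P[l][i][j] * (K(n, j) + alpha * phi[j][l])
                     for j in range(num_states))
                 for i in range(num_states)]
                for l in range(num_actions)]
        new_phi = [[0.0] * num_actions for _ in range(num_states)]
        policy = [[0] * num_actions for _ in range(num_states)]
        for i in range(num_states):
            for k in range(num_actions):
                costs = [S(l, k) + cont[l][i] for l in range(num_actions)]
                best = min(range(num_actions), key=costs.__getitem__)
                new_phi[i][k] = costs[best]
                policy[i][k] = best
        phi = new_phi
        policies.append(policy)
    return phi, policies
```

Ties between actions are broken in favour of the smallest l, which is one arbitrary but convenient choice when the sets of minimizing states overlap.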




Policy Structure 

For each period index n and k = 0, 1, 2, ..., s define

$I_n(l|k) = \{ i : \phi_n(i, \alpha, k) = \psi_n(l|i, \alpha, k) \}, \quad l = 0, 1, 2, \ldots, s.$

Define a policy $\pi$ by

$\pi(n, i, k) = l$ if $i \in I_n(l|k)$

with the interpretation: If there are k servers open and i customers present at the nth decision
epoch and $i \in I_n(l|k)$, then change the number of servers open to l.

The optimality of the above policy is obvious, since for any n, k and i, one can evaluate
$\psi_n(l|i, \alpha, k)$ for l = 0, 1, ..., s and assign the i to the appropriate set $I_n(l|k)$ which by definition will
yield the minimum expected cost.

It is reasonable to expect that for fixed n and k, the $I_n(l|k)$ are disjoint intervals. In this case,
the policy can be specified by a nondecreasing sequence of control limits

$i_0^*(k, n) \leq i_1^*(k, n) \leq \cdots \leq i_s^*(k, n) \leq i_{s+1}^*(k, n)$

with the interpretation that if there are k servers open with i customers present at a decision
epoch with n periods remaining and $i_l^*(k, n) \leq i < i_{l+1}^*(k, n)$, then change the number of servers
open to l. In this case we define the policy $\pi$ by

(1) $\pi(n, i, k) = l$ if $i_l^*(k, n) \leq i < i_{l+1}^*(k, n).$

The limits $i_l^*(k, n)$ can be obtained recursively from the set functions $I_n(l|k)$ as follows: set

$i_{s+1}^*(k, n) = +\infty;$

and for j = s, s-1, ..., 0, set

$i_j^*(k, n) = i_{j+1}^*(k, n)$ if $I_n(j|k) = \emptyset$

and otherwise set $i_j^*(k, n) = \min \{ i : i \in I_n(j|k) \}$.
A similar policy was suggested by Magazine [4].
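The recursive construction of the control limits from the sets of minimizing states is short enough to write out. The sketch below is illustrative and not from the paper; it assumes the sets for a fixed n and k are given as finite Python sets of queue lengths, indexed by l = 0, ..., s, and the function names are ours.

```python
import math


def control_limits(index_sets):
    """Return [i*_0, ..., i*_s, i*_{s+1}] from the sets I(l|k), l = 0, ..., s."""
    s = len(index_sets) - 1
    limits = [math.inf] * (s + 2)          # i*_{s+1} = +infinity
    for j in range(s, -1, -1):
        if index_sets[j]:
            limits[j] = min(index_sets[j])  # smallest queue length choosing j
        else:
            limits[j] = limits[j + 1]       # empty set: inherit the next limit
    return limits


def servers_to_open(i, limits):
    """Apply display (1): return l with i*_l <= i < i*_{l+1}, or None if none."""
    for l in range(len(limits) - 1):
        if limits[l] <= i < limits[l + 1]:
            return l
    return None
```

For example, with sets `[{0, 1}, set(), {2, 3, 4}, {5, 6}]` the limits are 0, 2, 2, 5 and infinity, so one server is never opened on its own: the policy jumps from zero to two servers when the queue reaches two customers.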

In the next section of the paper we show that an optimal policy of the above control limit
type exists.

The Control Limit Form of the Optimal Policy 

Let $Z_n$ and $L_n$ respectively denote the number of customers in the system and the number of
servers on at the decision epoch with n periods remaining. It is convenient to model the $\{Z_n, L_n\}$
process in the following way. Define

$X_n = \{ \tau_{n1}, \tau_{n2}, \tau_{n3}, \ldots; S_{n1}, S_{n2}, S_{n3}, \ldots \}$

to be a sequence of independent random variables with each $\tau$ having the exponential interarrival
distribution and each S having the exponential service time distribution. If

$r = \max \left\{ t : \sum_{k=1}^{t} \tau_{nk} < 1 \right\},$

then r customers will arrive during period n at times

$\tau_{n1}, \; \tau_{n1} + \tau_{n2}, \; \ldots, \; \tau_{n1} + \cdots + \tau_{nr}$

measured from the beginning of the period. The service times are $S_{n1}, S_{n2}, \ldots, S_{nr}$ for the r
customers who arrive during the period and

$S_{n,r+1}, S_{n,r+2}, \ldots, S_{n,r+Z_n}$

for the $Z_n$ customers in the system at the beginning of the period.

We also assume that $\{X_n : n = 1, 2, 3, \ldots\}$ is a sequence of independent vectors. Then $Z_{n-1}$ is a
function of $Z_n$, $L_n$, and $X_n$ for a given policy $\pi$, i.e. for each policy $\pi$, there is a function f such that
$Z_{n-1} = f(Z_n, L_n, X_n)$, n = 1, 2, ....

We assume that during period n, customers have priority in the order that the service times are 
listed in the vector X„. Thus the customers who arrive during the period have priority over those 
customers present at the beginning of the period. The priorities are preemptive, and customers who 
are preempted resume their service at the point it was preempted. 

Thus the process representing the number of customers in the system is modelled somewhat
differently than usual, since we choose new service times for all customers in the system at a decision
epoch, even if they are in service. In addition, the queue discipline is not the usual one. However,
since the service times are exponential, the probability distribution of the stochastic process $\{Z_n\}$
will be the same as in a normal M/M/S system. That is, $P_{ij}^l$ is the same for our system as for the usual
first-come first-serve M/M/S system.

Consequently, any result that depends only on $P_{ij}^l$ will be true in both our system and the
usual system. In particular, all results except for Lemma 1 hold for the usual M/M/S system.

In the following discussion, the period index n will be clear from the context and will be dropped
for notational convenience.

Let $N_l(X|i)$ be the random variable denoting $Z_{n-1}$ given that $\pi(Z_n, L_n) = l$ and $Z_n = i$; that is,
$N_l(x|i)$ is the number of customers left at the end of the period, given that i customers are present
at the beginning of the period, that l servers are open during the period, and that $X_n = x$.

The following discussion uses the difference notation

$\frac{\Delta f(y)}{\Delta y} = \frac{f(y + \Delta y) - f(y)}{\Delta y}.$

LEMMA 1: For any realization x of X, $N_m(x|i+u) - N_{m'}(x|i)$ is non-negative and nondecreasing
in i for $m' \geq m$ and $u \geq 0$.

PROOF: For any realization, we have that

$\frac{\Delta N_m(x|i+u)}{\Delta i}$ and $\frac{\Delta N_{m'}(x|i)}{\Delta i}$

are either 0 or 1.

To see this, note that the additional customer has the lowest priority, and so does not affect
any of the other customers. The number left at the end of the period increases by one if the additional
customer does not complete service, and remains the same if he does. It is equally clear that

$N_{m'}(x|i+1) = N_{m'}(x|i) + 1$

implies

$N_m(x|i+u+1) = N_m(x|i+u) + 1.$

This gives

$\frac{\Delta N_m(x|i+u)}{\Delta i} \geq \frac{\Delta N_{m'}(x|i)}{\Delta i}.$

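LEMMA 2: Assume that

$\frac{\Delta^2 K(i)}{\Delta i^2} \geq 0.$

Then for $m' \geq m$,

$\sum_{j=0}^{\infty} [P_{ij}^m - P_{ij}^{m'}] K(j)$

is nondecreasing in i.
PROOF:

$\sum_{j=0}^{\infty} [P_{ij}^m - P_{ij}^{m'}] K(j) = E[K(N_m(X|i)) - K(N_{m'}(X|i))].$

For X = x, we have

$\frac{\Delta K(N_m(x|i))}{\Delta i} = \left. \frac{\Delta K(y)}{\Delta y} \right|_{y = N_m(x|i)} \frac{\Delta N_m(x|i)}{\Delta i} \geq \left. \frac{\Delta K(y)}{\Delta y} \right|_{y = N_m(x|i)} \frac{\Delta N_{m'}(x|i)}{\Delta i} \geq \left. \frac{\Delta K(y)}{\Delta y} \right|_{y = N_{m'}(x|i)} \frac{\Delta N_{m'}(x|i)}{\Delta i} = \frac{\Delta K(N_{m'}(x|i))}{\Delta i},$

where the first inequality follows by Lemma 1 and the second, since $N_m(x|i) \geq N_{m'}(x|i)$ and

$\frac{\Delta^2 K(i)}{\Delta i^2} \geq 0.$

Thus $K(N_m(x|i)) - K(N_{m'}(x|i))$ is nondecreasing in i. Hence the expectation is also nondecreasing
in i.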
THEOREM 1: Assume that

$\frac{\Delta^2 K_1(i)}{\Delta i^2} \geq 0.$

Then for n = 1 the optimal policy is of the control limit form given by display (1).
PROOF: Recall that

$\phi_1(i, \alpha, k) = \min_{0 \leq l \leq s} \psi_1(l|i, \alpha, k).$

We prove the theorem by showing that $\psi_1(l|i, \alpha, k) - \psi_1(l+1|i, \alpha, k)$ is nondecreasing in i, which
implies that $\psi_1(l|i, \alpha, k)$ intersects $\psi_1(l+1|i, \alpha, k)$ at most once and from below.
The above difference can be written as

$[S(l|k) - S(l+1|k)] + \sum_{j=0}^{\infty} (P_{ij}^l - P_{ij}^{l+1}) K_1(j).$

The first term is constant in i and the second term is nondecreasing in i by Lemma 2.
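LEMMA 3: Assume that

$\frac{\Delta^2 K_1(i)}{\Delta i^2} \geq 0.$

Then

$\frac{\Delta^2 \phi_1(i, \alpha, k)}{\Delta i \, \Delta k} \leq 0.$

PROOF: Let

$\phi_1(i, \alpha, k) = S(k_1|k) + \sum_{j=0}^{\infty} P_{ij}^{k_1} K_1(j)$

$\phi_1(i, \alpha, k+1) = S(k_2|k+1) + \sum_{j=0}^{\infty} P_{ij}^{k_2} K_1(j)$

$\phi_1(i+1, \alpha, k) = S(k_1'|k) + \sum_{j=0}^{\infty} P_{i+1,j}^{k_1'} K_1(j)$

$\phi_1(i+1, \alpha, k+1) = S(k_2'|k+1) + \sum_{j=0}^{\infty} P_{i+1,j}^{k_2'} K_1(j)$

First we show that $k_2 \geq k_1$ and $k_2' \geq k_1'$. Since $\phi_1(i, \alpha, k) \leq \psi_1(l|i, \alpha, k)$ for each l, we have

$S(k_1|k) - S(l|k) \leq \sum_{j=0}^{\infty} (P_{ij}^l - P_{ij}^{k_1}) K_1(j).$

If $l < k_1$, then by assumption (ii) of the cost structure

$S(k_1|k+1) - S(l|k+1) \leq S(k_1|k) - S(l|k).$

Consequently,

$S(k_1|k+1) - S(l|k+1) \leq \sum_{j=0}^{\infty} (P_{ij}^l - P_{ij}^{k_1}) K_1(j).$

Regrouping terms provides

$\psi_1(k_1|i, \alpha, k+1) \leq \psi_1(l|i, \alpha, k+1)$ for $l < k_1$.

This expression demonstrates that $k_2 \geq k_1$. A similar argument shows that $k_2' \geq k_1'$.
From Theorem 1 we also know that $k_1' \geq k_1$ and $k_2' \geq k_2$.
The proof proceeds by fixing k and considering three cases.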
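1. Suppose $i \geq i_{k+1}^*(k)$. By definition of $\phi_1(i, \alpha, k)$ we have

$S(k_1|k) + \sum_{j=0}^{\infty} P_{ij}^{k_1} K_1(j) \leq S(l|k) + \sum_{j=0}^{\infty} P_{ij}^{l} K_1(j).$

From Theorem 1, we know that $k_1 \geq k+1$. So the above inequality implies

$S(k_1|k+1) + \sum_{j=0}^{\infty} P_{ij}^{k_1} K_1(j) \leq S(l|k+1) + \sum_{j=0}^{\infty} P_{ij}^{l} K_1(j)$

for $l \geq k_1$, and thus $k_2 \leq k_1$. However, we showed earlier that $k_2 \geq k_1$. Therefore $k_1 = k_2$. In this case,

$\frac{\Delta \phi_1(i, \alpha, k)}{\Delta k} = S(k_1|k+1) - S(k_1|k)$
and

$\frac{\Delta^2 \phi_1(i, \alpha, k)}{\Delta i \, \Delta k} = 0.$

2. Suppose $i < i_{k+1}^*(k+1)$. In this case a similar argument shows that again $k_1 = k_2$ and

$\frac{\Delta \phi_1(i, \alpha, k)}{\Delta k} = S(k_2|k+1) - S(k_2|k).$
Thus

$\frac{\Delta^2 \phi_1(i, \alpha, k)}{\Delta i \, \Delta k} = 0.$

3. Suppose $i_{k+1}^*(k+1) \leq i < i_{k+1}^*(k)$. From Theorem 1, we have that $k_1 \leq k$, $k_1' \leq k$, $k_2 > k$,
and $k_2' > k$. Combining these inequalities with those verified at the first part of the proof provides
$k_1 \leq k_1' < k+1 \leq k_2 \leq k_2'$.

Since $k_2 \geq k_1'$,

$\phi_1(i, \alpha, k) - \phi_1(i, \alpha, k+1) = S(k_1|k) + \sum_{j=0}^{\infty} P_{ij}^{k_1} K_1(j) - S(k_2|k+1) - \sum_{j=0}^{\infty} P_{ij}^{k_2} K_1(j)$

$\leq S(k_1'|k) + \sum_{j=0}^{\infty} P_{ij}^{k_1'} K_1(j) - S(k_2|k+1) - \sum_{j=0}^{\infty} P_{ij}^{k_2} K_1(j)$

$\leq S(k_1'|k) + \sum_{j=0}^{\infty} P_{i+1,j}^{k_1'} K_1(j) - S(k_2|k+1) - \sum_{j=0}^{\infty} P_{i+1,j}^{k_2} K_1(j)$

$\leq S(k_1'|k) + \sum_{j=0}^{\infty} P_{i+1,j}^{k_1'} K_1(j) - S(k_2'|k+1) - \sum_{j=0}^{\infty} P_{i+1,j}^{k_2'} K_1(j)$

$= \phi_1(i+1, \alpha, k) - \phi_1(i+1, \alpha, k+1),$
where the middle inequality follows from Lemma 2. Thus

$\frac{\Delta^2 \phi_1(i, \alpha, k)}{\Delta i \, \Delta k} \leq 0$

for $i_{k+1}^*(k+1) \leq i < i_{k+1}^*(k)$, which concludes the proof.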
LEMMA 4 : Assume that 

for n=0, 1 and all k=0, 1, • • •, s. Then for 

m'>m, S {P,nK2U)+cx4>i(j, «, m)]-P,r [^2(i) + a0i(i,a, m')]} 
;=o 

s nondecreasing in i. 



134 C. C. HUANG, S. L. BRUMELLE, K. SAWAKI AND I, VERTINSKY 

PROOF: 

i: {Pir[K2{j)+acf>,(j,a, m)]-P,r[K2U)+aMJ,a, m')]} 

=E{[K2iN„{X\i))+a<l>,(N„(X\i),a,m)]-[K2(N„>(X\i))-\-aMNrn'(X\i),a,m')]] 
For X=x we have 

A[K2{NM\i))+ciMNJx\i). a, m)] r AK2iy) ^^ AMy, «, m) I "I ANM\i) 

Ai L ^y y=N„(.x\i) Ay \y=N„(.x\i)j Ai 

FAKiiy) I , A<t>i{y, a, m) \ "J ANr^.{x\i) 



rAKiiy) I A<^i(y, a, m) I l 

^ 1 r« 7 

L % |y = JV„(i|i) Ay \y=N^(.z\i)J 

^r AK2(y) I ^ ^ A«^i(y, g, m) I n AA^^>(a;|t) ^f 

""L ^y \y=N„'(.z\i) Ay \y=N„'{z\i)J Ai ~\_ 



At 
Ay 



y=NJ(x\i) 



A<t>^ {y, a, m ) 

+ " A 

Ay 



J 



ANJ (x\i) _A[K2{N„.{x\i)) + aMNm'(x\ i), a, m')] ^ 



y=N„'{x\i)J Al Al 



where the first inequality is by Lemma 1, the second since 

Al 

and the third by Lemma 3. 
Consequently, 

[K2(Nrr,(x\i)) + a<t>i(Nm(x\i), «, m)]-[K2iN^' (x\i)) + a^N^' {x\i) , a, m')] 

is nondecreasing in i for every realization of X. Hence its expectation is also nondecreasing in i and 
the lemma is proved. 

Thus, by using Lemma 4 and following the same arguments as in Theorem 1, we find that $\phi_2$ has
an optimal control limit policy given by display (1).
By using the above arguments recursively, the following theorem is obtained. 

THEOREM 2: Assume that 



A" 
Ai 



j2[K„+i(i) + a(i>nii, a, k)]>0 



for ^=0, 1, 2, . . ., s and that the optimal policy in period n has the control limit form given by 

display (1). Then 

is nondecreasing in i and the optimal policy in period n-^l has the control limit form given by 
display (1). 

COROLLARY: If K^ ■ . ., K^ are sufficiently convex such that 

A\K„+,{i)+a < i>^{i,a,k)] 

Ai'' - 




for 71=0, 1, • • •, n—1 and all k=0, 1, ■ • •, s, then the optimal policies for the first n periods have 
the control limit form. 

REFERENCES 

[1] Balachandran, K. R., "Control Policies for a Single Server System," Management Science, 19,
1013-1018 (1973).

[2] Bell, C. E., "Characterization and Computation of Optimal Policies for Operating an M/G/1 
Queuing System with Removable Server," Operations Research, 20, 2080-2180 (1972). 

[3] Heyman, D. P., "Optimal Operating Policies for M/G/1 Queuing Systems," Operations Research, 
16, 362-382 (1968). 

[4] Magazine, M. J., "Optimal Control of Multi-channel Service Systems," Naval Research 
Logistics Quarterly 18, 177-183 (1971). 

[5] Sobel, M. J., "Optimal Average-Cost Policy for Queue with Start-up, and Shut-down Costs," 
Operations Research, 17, 145-162 (1969). 

[6] Zacks, S. and M. Yadin, "Analytic Characterization of the Optimal Control of a Queueing System,"
Journal of Applied Probability, 617-633 (1970).



CYCLICAL JOB SEQUENCING ON MULTIPLE SETS OF IDENTICAL 

MACHINES* 



Helman I. Stern 

Ben Gurion University of the Negev 
Beersheva, Israel 

Edgardo P. Rodriguez 

World Bank 
Washington, D.C. 

Merlin L. Utter 

Proctor and Gamble 
Cincinnati, Ohio 



ABSTRACT 



The problem posed in this paper is to sequence or route n jobs, each originating 
at a particular location or machine, undergoing r-1 operations or repairs, and
terminating at the location or machine from which it originated. The problem is
formulated as a 0-1 integer program, with block diagonal structure, comprised of
r assignment subproblems, and a joint set of constraints to insure cyclical sequences.
To obtain integer results the solutions to each subproblem are ranked as required 
and combinations thereof are implicitly enumerated. The procedure may be 
terminated at any step to obtain an approximate solution. Some limited computa- 
tional results are presented. 



INTRODUCTION 

Much attention has been devoted to the classical n job-m machine shop scheduling problem. 
In most investigations each job is given a prespecified technological ordering. Less attention has 
been given to the problem of job processing options. For example, a job may be processed on machine 
A or machine B but not both. The problem posed in this paper is to sequence or route n jobs, each 
originating at a particular location or machine, undergoing r-1 ordered operations or repairs, in
which the s-th operation may be performed on only one of n similar machines; after which each job
terminates at the machine from which it originated. The problem may also be visualized as the



*This research was partially supported by National Science Foundation grant No. GK-27836.




routing of a fleet of n vehicles or ships with each vehicle loading a commodity of type 1 at its origin 
location, delivering this commodity to any city in group 1, reloading a commodity of type 2 to be 
delivered to any city in group 2, etc. ; such that each city in a group is visited exactly once. On the 
final leg of the journey each vehicle either returns empty or discharges a commodity of type r at its 
origin location. The problem is also a special case of the m traveling salesman problem requiring 
m disjoint closed tours subject to a visitation constraint where each salesman must visit one city 
from each group in a specified order. 

To give the problem concreteness let there be n jobs and r different types of machines with n 
identical machines of each type 

(s=l, 2, . . ., r). 

Let the i-th machine of type s be represented as $m_i(s)$. Each job k is to initiate and terminate its
sequence on the same machine of type 1, say machine $m_k(1)$. Job k must be processed on exactly one
machine of each of the machine types 2, 3, . . ., r, in increasing order. No machine may process
more than one job. Thus, the k-th job has the technological ordering

$$\{[m_k(1)],\ [m_1(2) \text{ or } m_2(2) \ldots \text{ or } m_n(2)],\ \ldots,\ [m_1(s) \text{ or } m_2(s) \ldots \text{ or } m_n(s)],\ \ldots,\ [m_1(r) \text{ or } m_2(r) \ldots \text{ or } m_n(r)],\ [m_k(1)]\}$$

A feasible solution to this problem must have each machine assigned to one and only one job. An
example of a feasible solution for 3 jobs and 4 machine types is shown in Figure 1, and consists of 3
disjoint cycles each comprised of four arcs.

For this particular problem there are $(3!)^3 = 216$ feasible solutions. In general for an n job, r
machine type problem there are $(n!)^{r-1}$ feasible solutions. If the cost of transferring a job of any kind
from machine $m_i(s)$ to $m_j(s+1)$ is given as $c_{ij}(s)$, the problem becomes one of finding an efficient
algorithm that searches the set of feasible solutions and selects that solution (or solutions) that
minimizes the total cost of sequencing all jobs. It is assumed that the processing costs on any machine
of a given type are job independent. Thus, if a processing cost of $c_i(s)$ represents the cost to



Figure 1. Cyclical sequencing of three jobs through four sets of machines. [Diagram omitted: rows Job 1, Job 2, Job 3; columns Machine Type 1 through Machine Type 4.]






process any job on machine $m_i(s)$, it may be added to the transfer cost and included in $c_{ij}(s)$ without
loss of generality. The problem is formulated as a 0-1 integer program, with block diagonal structure
comprised of r assignment subproblems, and a joint set of constraints that insure closed tours. Since
all variables are 0-1, the traditional Dantzig-Wolfe decomposition scheme is precluded. To insure
integer results the solutions to each assignment subproblem are ranked as required and combinations
thereof are implicitly enumerated in a branch and bound scheme. The procedure may be
terminated at any step to obtain a feasible solution along with bounds on its accuracy. Some limited
computational results are given in the last section.
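
The count of feasible solutions quoted in this introduction can be checked by brute force on a small instance. The following minimal Python sketch treats each of the r stages as a permutation of the n machines and counts the combinations in which every job returns to its origin. The function name `count_closed` and the 0-based permutation encoding are illustrative assumptions, not part of the original paper.

```python
from itertools import permutations, product
from math import factorial


def count_closed(n, r):
    """Count r-stage assignments in which every job returns to its origin."""
    perms = list(permutations(range(n)))  # all n! assignments of one stage
    count = 0
    for stages in product(perms, repeat=r):
        closed = True
        for k in range(n):
            node = k
            for p in stages:          # follow job k through every stage
                node = p[node]
            if node != k:             # job k did not come back to node k
                closed = False
                break
        count += closed
    return count


print(count_closed(3, 4))            # 216
print(factorial(3) ** (4 - 1))       # 216, i.e. (n!)^(r-1)
```

The first r-1 stages can be chosen freely and the last one is then forced, which is why the count is $(n!)^{r-1}$ rather than $(n!)^r$.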

Related work has appeared in the literature starting with the truck dispatching problem of 
Dantzig and Ramser [1], Clarke and Wright [2], and Newton and Thomas [3], in which all n tours
originate and terminate at a single city with the number of cities visited by each tour implicitly 
conditional on the order of visits (due to demand variations at each city). Krolak, Felts, and Nelson 
[4] report on the work of Newton and Thomas in which multi-origin, single destination routes are 
scheduled. Bellmore, Liebman, and Marks [5] consider a multi-origin problem in which each tour 
is open but with ordered node set visitations. Svestka and Huckfeldt [6] solve an n closed tour problem
through m nodes such that n sorties of any length commence and terminate at a single node.
Srivastava, Kumar, Garg, and Sen [7] consider a single closed tour visiting r cities, one from each of r 
ordered sets of cities, such that each set has a cardinality of at least one. 

MATHEMATICAL FORMULATION 

The problem may be formulated as a 0-1 minimum cost network flow problem with closed tour 
side conditions. The underlying network is depicted in Figure 2 where the node set $N_s$ represents
the set of n machines of type s. The node sets are ordered from left to right with $N_1$ repeated and
designated as $N_{r+1}$ as a visual aid. The only arcs in the network are those emanating from each
set of nodes to the next higher set of nodes. The arcs from $N_r$ to $N_{r+1}$ are return arcs required to
complete cycles. Let the machine sequence of each job start from its origin node in $N_1$, proceed
through a single node from each of the remaining ordered sets, $N_2, N_3, \ldots, N_r$, and end at its
origin node repeated in $N_{r+1}$. In Figure 2 the only arcs shown are those possible for a cyclical



Figure 2. Network representation of problem (only flows for job k are shown). [Diagram omitted; the arc sets between successive node sets are labeled X(1), X(2), X(s), X(s+1), X(r).]

machine sequence for job k. The variables $x_{ij}^k(s)$ represent the amount of job k flowing from
node i in $N_s$ to node j in $N_{s+1}$. The set of all flows from $N_s$ to $N_{s+1}$ is represented as X(s). To insure
full job assignments to machines let $x_{ij}^k(s)$ be restricted to a unit value if the k-th job is sequenced
on machine j in $N_{s+1}$ directly after machine i in $N_s$. Otherwise, the flow will be restricted
to zero. Total flows through each node in the network will be restricted to unity so that exactly
one job will be sequenced through each machine. In addition, only the k-th job may flow through
node k of $N_1$. Thus, one may drop the superscripts from X(1) and X(r) and reduce the number
of flow variables to those described by $y_{ki}(1)$ and $y_{jk}(r)$. (This is an arbitrary stipulation, as any
one to one correspondence between the n jobs and n machines or origins in $N_1$ will suffice.) This
final restriction insures that the k-th job originates and terminates its sequence at node k of sets
$N_1$ and $N_{r+1}$. In addition, define 0-1 variables $y_{ij}(s)$ for s=2, 3, . . ., r-1 which represent the
joint flows from each arc (i, j) in $(N_s, N_{s+1})$. Let $c_{ij}(s)$ represent the cost (distance) of any job
transferred from i in $N_s$ to j in $N_{s+1}$, and Z represent the total cost incurred by all job transfers.
Then the problem of minimizing the total job sequencing cost may be formulated as a minimum
cost flow problem with 0-1 integer flows and closed tour side constraints. The mathematical formulation
of this problem is shown below as problem (P).
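Integer Program (P)

$$\min Z = \sum_{(k,i)\in(N_1,N_2)} c_{ki}(1)\,y_{ki}(1) + \sum_{s=2}^{r-1}\ \sum_{(i,j)\in(N_s,N_{s+1})} c_{ij}(s)\,y_{ij}(s) + \sum_{(j,k)\in(N_r,N_{r+1})} c_{jk}(r)\,y_{jk}(r)$$

Subject to:

(1) $\sum_{j\in N_{s+1}} y_{ij}(s) = 1,\quad i\in N_s$

(2) $\sum_{i\in N_s} y_{ij}(s) = 1,\quad j\in N_{s+1}$

for s = 1, . . ., r

(3) $y_{ki}(1) - \sum_{j\in N_3} x_{ij}^k(2) = 0,\quad (k, i)\in(N_1, N_2)$

(4) $y_{ij}(2) - \sum_{k=1}^{n} x_{ij}^k(2) = 0,\quad (i, j)\in(N_2, N_3)$

(5) $\sum_{i\in N_{s-1}} x_{ij}^k(s-1) - \sum_{i\in N_{s+1}} x_{ji}^k(s) = 0,\quad j\in N_s$

(6) $y_{ij}(s) - \sum_{k=1}^{n} x_{ij}^k(s) = 0,\quad (i, j)\in(N_s, N_{s+1})$

for s = 3, . . ., r-1

(7) $y_{jk}(r) - \sum_{i\in N_{r-1}} x_{ij}^k(r-1) = 0,\quad (j, k)\in(N_r, N_{r+1})$

(8) $y_{ij}(s) = 0, 1;\quad (i, j)\in(N_s, N_{s+1}),\quad s = 1, \ldots, r$

(9) $x_{ij}^k(s) = 0, 1;\quad (i, j)\in(N_s, N_{s+1}),\quad s = 2, \ldots, r-1,\quad k = 1, \ldots, n$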

Equations (1), (2), and (8) represent r independent assignment problem constraints insuring
that each node has unit flow. Together they provide n disjoint job sequences from nodes in $N_1$ to
nodes in $N_{r+1}$, not necessarily closed. The interstage coupling constraints (3), (4), (5), (6), (7), and
(9) provide the conditions necessary for all tours to be closed (cyclical sequences). Upon rearrangement,
the entire set of constraints can be shown to exhibit a block diagonal structure.

DECOMPOSITION OF (P) 

Let (B) represent the set of closed tour coupling constraints. Removal of (B) from (P) allows
the remaining portion of (P) to be decoupled into r independent assignment problems. Let $(A_s)$
represent the s-th stage assignment problem from machines in $N_s$ to machines in $N_{s+1}$. Thus, $(A_s)$
is of the form:

Minimize

$$Z_s = \sum_{(i,j)\in(N_s,N_{s+1})} c_{ij}(s)\,y_{ij}(s)$$

Subject to:

$$\sum_{j\in N_{s+1}} y_{ij}(s) = 1,\quad i\in N_s$$

$$\sum_{i\in N_s} y_{ij}(s) = 1,\quad j\in N_{s+1}$$

$$y_{ij}(s) = 0, 1;\quad (i, j)\in(N_s, N_{s+1})$$

Let (A) represent the union of all r assignment problems with a feasible solution defined by
[Z, Y] such that

$$Z = \sum_{s=1}^{r} Z_s,\qquad Y = [Y_1, \ldots, Y_s, \ldots, Y_r],$$

where $Y_s$ is a feasible solution to $(A_s)$ and $Z_s$ is its associated cost.

Note that an optimal solution $[Z^\circ, Y^\circ]$ to (A) provides a lower bound to (P). If, in addition,
there exists a solution $[X, Y^\circ]$ to (B), then $[Z^\circ, Y^\circ, X]$ is also optimal for (P). Otherwise, one may
enumerate all feasible solutions to (A) and select, among those that also provide a feasible solution
to (B), the one with the lowest objective value. Such a process, however, requires the computation
of $(n!)^r$ solutions to (A) and an equal number of feasibility tests in (B).† A problem with n=7
and r=3 requires the determination 6.4 X 10" feasible solutions to (A) . Hence, an implicit enumeration 
branch and bound procedure is proposed. This procedure requires an efficient technique for de- 
termining if proposed solutions to (A) satisfy the side conditions in (B), as well as a method of 
constructing upper bounds on (P). 

PROPOSITION 1 : Testing Y for Feasible Closed Tours 

If $(Z^*, Y^*)$ is a feasible solution to (A), then (B) has either a unique solution $[X^*, Y^*]$ and $Z^*$
provides an upper bound for (P), or there does not exist a solution to (B) corresponding to $Y^*$.

This result follows from the staircase structure of the constraint set (B) where (3) and (4)
link stages 1 and 2, (5) and (6) link stages s and s+1, for s=2, . . ., r-1, and (7) links stages
r-1 and r. Since each assignment solution provides exactly n variables with unit values and
n(n-1) variables at zero, the $Y_1^*, \ldots, Y_{r-1}^*$ determine uniquely $X_1^*, \ldots, X_{r-1}^*$. If $Y_r^*$, $X_{r-1}^*$ also
satisfy (7), $Z^*$ provides an upper bound for (P). Otherwise, there does not exist a solution to (P)
corresponding to $Y^*$. Intuitively, one is projecting the paths of all n jobs, stagewise, from their
origin nodes in $N_1$ to their terminating nodes in $N_{r+1}$. Constraint (7) provides the test to determine
if the k-th job originates and terminates at the k-th node in $N_1$. In lieu of sequentially solving the
set of coupling equations for the $x_{ij}^k(s)$ one may incorporate a composite function on the subscripts

†Note that all feasible solutions to the assignment problem are basic solutions.






of the $Y^*$ solution for an easily computerized closed tour test. Moreover, if $Y^*$ does not provide
a closed tour solution, fixing $Y^*$ for any r-1 stages yields a forced feasible solution for the remaining
r-th stage, providing a feasible upper bound on (P). This observation yields the following proposition:

PROPOSITION 2 : Construction of Revised Feasible Solution 

Given $(Z^*, Y^*)$, a solution to (A) which does not provide a feasible solution for (B), one may
obtain a feasible solution to (B) by constructing a revised solution

$$Y^{*s} = [Y_1^*, \ldots, \bar{Y}_s^*, \ldots, Y_r^*]$$

where $\bar{Y}_s^*$ is the unique s-th stage solution to $(A_s)$ which allows the construction of a feasible solution
to (B) (a revised solution) and provides a feasible upper bound of $Z^{*s}$ on (P). By computing a set of
r revised upper bounds $Z^{*s}$, s = 1, . . ., r, for (P) the tightest upper bound, $\bar{Z}^*$, may be determined
as

$$\bar{Z}^* = \min_{s=1,\ldots,r} Z^{*s}$$

Example of Revised Solutions 

Consider a solution shown pictorially in Figure 3a for a three machine problem with three jobs. 
This solution is feasible for (A) but infeasible for (B). The three uniquely determined revised solu- 
tions (feasible for (B)) are shown in Figures 3b, 3c and 3d. 
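
The closed tour test of Proposition 1 and the revised solutions of Proposition 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each stage assignment is encoded as a 0-based tuple `p` with `p[i]` the node reached from node `i`, the names `follow`, `is_closed` and `revise` are hypothetical, and the sample solution `Y` is made up (it is not the one drawn in Figure 3).

```python
def follow(perms, k):
    """Node reached from node k after passing through the given stages in order."""
    for p in perms:
        k = p[k]
    return k


def is_closed(perms):
    """Closed tour test: the composite of all stages must send each k back to k."""
    return all(follow(perms, k) == k for k in range(len(perms[0])))


def revise(perms, s):
    """Return a copy of perms whose stage s is the unique tour-closing assignment."""
    n = len(perms[0])
    q = [0] * n
    for k in range(n):
        i = follow(perms[:s], k)  # node where job k stands before stage s
        # the single node from which the remaining stages lead back to k
        j = next(j for j in range(n) if follow(perms[s + 1:], j) == k)
        q[i] = j
    return perms[:s] + [tuple(q)] + perms[s + 1:]


# Hypothetical 3-job, 3-stage solution to (A); not taken from the paper.
Y = [(1, 2, 0), (0, 2, 1), (0, 1, 2)]
print(is_closed(Y))                  # False
for s in range(3):
    print(s + 1, revise(Y, s)[s])    # (0, 2, 1), then (2, 0, 1), then (2, 1, 0)
```

Each revised solution keeps r-1 stages fixed, so there are exactly r of them for an infeasible solution, one per stage.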




Figure 3. Example of revised solutions. [Diagrams omitted: panel (a) shows the given solution; panels (b), (c) and (d) show the three revised solutions.]




TERMINOLOGY 

The results of propositions 1 and 2 provide the constructive basis for determining solutions to 
(A) and upper bounds on (P). To facilitate the desire to implicitly enumerate all potential solutions 
of (A) the following definitions are introduced : 

1. The Ranked Assignment, $Y_{st}$:

The assignments to $(A_s)$ may be ordered by non-decreasing values of $Z_s$, such that $Y_{st}$ represents
the t-th ranked assignment and $Z_{st}$ its associated cost. The ranking index t = 1, . . ., n! is
selected by the rule

$$\forall\, a, b \in \{1, \ldots, n!\} \ni a > b,\quad Z_{sa} \ge Z_{sb}$$

2. A Node, $Y^k$:

Any feasible solution to (A), say the one identified as the k-th solution, may be represented as a
node, $Y^k$, where

$$Y^k = \{Y_{1t_1}, \ldots, Y_{st_s}, \ldots, Y_{rt_r}\},\quad t_s \in \{1, \ldots, n!\},\ s = 1, \ldots, r.$$

Where clear, a node may be represented as an ordered set of assignments, each assignment
represented by the integer equal to its rank, i.e.,

$$(t_1, \ldots, t_s, \ldots, t_r)$$

3. Set of Solutions:

Let $P_{u_1, \ldots, u_s, \ldots, u_r}$ be the set of all solutions to (A) less the $(u_s - 1)$ best solutions for
$(A_s)$, s = 1, . . ., r, i.e.,

$$P_{u_1, \ldots, u_s, \ldots, u_r} = \{(t_1, \ldots, t_s, \ldots, t_r) \mid t_s \ge u_s,\ s = 1, \ldots, r\}.$$

Thus, $P_{1, \ldots, 1}$ is the set of all feasible solutions to (A). (When clear we shall refer to the
set of all feasible solutions as P.)

4. Cost of Node $Y^k$, $Z^k$:

The sum of the costs of each of the assignments in $Y^k$ is

$$Z^k = \sum_{s=1}^{r} Z_{st_s}$$

5. A Feasible Node, $Y^k$:

A node, $Y^k$, shall henceforth be called feasible if and only if it provides a feasible solution to
(P). (This may be determined by the results of Proposition 1.)

6. A Revised Node, $Y^{ks}$:

A revised node, $Y^{ks}$, is the node obtained from an infeasible node $Y^k$ by replacing the s-th assignment
$Y_{st_s}$ by the unique assignment $Y_{su_s}$ such that the new node becomes feasible ($u_s \ne t_s$).

7. Cost of Revised Node $Y^{ks}$, $Z^{ks}$:

The cost associated with the revised node $Y^{ks}$ is

$$Z^{ks} = Z^k - Z_{st_s} + Z_{su_s}.$$

8. Upper Bound Associated With Node $Y^k$, $\bar{Z}^k$:

An upper bound on (P) associated with node $Y^k$ may be determined as $Z^k$ if $Y^k$ is feasible, or
from its set of revised nodes (see Proposition 2) if $Y^k$ is infeasible, i.e.,

$$\bar{Z}^k = \begin{cases} Z^k, & \text{if } Y^k \text{ is feasible} \\ \min_{s=1,\ldots,r} Z^{ks}, & \text{if } Y^k \text{ is infeasible} \end{cases}$$

9. Dominance Between Nodes:

The node $(u_1, \ldots, u_s, \ldots, u_r)$ is said to be dominated by the node $(v_1, \ldots, v_s, \ldots, v_r)$ if

$$u_s \ge v_s,\quad s = 1, \ldots, r.$$

It follows from 1 and 4 above that if node $Y^k$ is dominated by node $Y^l$ then

$$Z^k \ge Z^l.$$

In Figure 4 all nodes below the dotted line are dominated by the node (1, 2, 1).
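
Definitions 4 and 6 through 9 can be made concrete with a small sketch. It is an illustration only: nodes are 0-based rank tuples, the ranked lists are produced by brute force rather than by a ranking procedure, the revised node of each stage is found by searching that stage's ranked list for the rank that closes all tours, and the 2-job cost matrices are invented for the example.

```python
from itertools import permutations


def ranked(cost):
    """All assignments of one stage as (cost, permutation), cheapest first."""
    n = len(cost)
    return sorted((sum(cost[i][p[i]] for i in range(n)), p)
                  for p in permutations(range(n)))


def is_closed(perms):
    """True if every job k returns to node k after all stages."""
    for k in range(len(perms[0])):
        node = k
        for p in perms:
            node = p[node]
        if node != k:
            return False
    return True


def node_cost(node, lists):
    """Definition 4: sum of the stage costs selected by the rank tuple."""
    return sum(lists[s][t][0] for s, t in enumerate(node))


def feasible(node, lists):
    return is_closed([lists[s][t][1] for s, t in enumerate(node)])


def upper_bound(node, lists):
    """Definition 8: own cost if feasible, else the cheapest revised node."""
    if feasible(node, lists):
        return node_cost(node, lists)
    revised = []
    for s in range(len(node)):
        for u in range(len(lists[s])):   # search stage s for the closing rank
            trial = node[:s] + (u,) + node[s + 1:]
            if feasible(trial, lists):
                revised.append(node_cost(trial, lists))
    return min(revised)


def dominated(u, v):
    """Definition 9: u is dominated by v if every rank of u is at least v's."""
    return all(a >= b for a, b in zip(u, v))


# Made-up 2-job, 3-stage cost matrices (not from the paper).
lists = [ranked(c) for c in ([[1, 4], [3, 1]], [[5, 1], [2, 6]], [[2, 3], [4, 2]])]
print(node_cost((0, 0, 0), lists), upper_bound((0, 0, 0), lists))   # 9 12
print(dominated((1, 0, 1), (0, 0, 1)))                              # True
```

Because each ranked list is sorted by cost, a dominated node can never be cheaper than the node dominating it, which is the property the dominance test relies on.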



IMPLICIT ENUMERATION OF SOLUTIONS THROUGH BRANCH AND BOUND OF SETS 
OF RANKED ASSIGNMENTS 

In this section the general rationale for searching the solution set P is presented in terms of a
branch and bound algorithm. The algorithm is initiated by generating the node $Y^1$ = (1, . . ., 1, . . ., 1)
and testing it for feasibility using the results of Proposition 1.† If $Y^1$ is feasible then it represents
an optimal solution to (P) with optimal cost $Z^1$. Otherwise, $Z^1$ offers a lower bound to (P),
in which case an upper bound may be constructed from $Y^1$ by employing the result of Proposition 2.
The scheme for generation of future nodes shall incorporate the assignment ranking procedure of
K. G. Murty [8].

It is useful to present the scheme in terms of the tree diagram shown in Figure 4 (for a three 
stage problem) with each node represented as a point and arrows between points representing 
branching from a predecessor node to a successor node. 

Figure 4. Illustrative tree for a three stage problem. [Tree diagram omitted; visible node labels include (1,1,1), (2,1,1), (1,1,2) and (1,1,3).]



†According to Definition 2 this node represents the first ranked solutions of all assignment subproblems.




Branches in the tree are limited to those from a node of the form $(t_1, \ldots, t_s, \ldots, t_r)$ to successor
nodes of the form

$$(t_1+1, \ldots, t_s, \ldots, t_r),\ \ldots,\ (t_1, \ldots, t_s+1, \ldots, t_r),\ \ldots,\ (t_1, \ldots, t_s, \ldots, t_r+1).$$

Any of the successor nodes, say the s-th one, may easily be determined from its predecessor through
the construction of the next best ranked assignment $Y_{s,t_s+1}$ for the s-th stage assignment problem. To
reduce the information storage requirements the ranked assignments are exhausted one stage at a
time (to be clarified subsequently).

Associated with each node in the tree, say the k-th, are its cost $Z^k$ and an upper bound $\bar{Z}^k$. If
the node is feasible then $Z^k$ provides an upper bound and $\bar{Z}^k$ is set equal to $Z^k$; otherwise an associated
upper bound $\bar{Z}^k$ is determined through construction of revised nodes (see Definition 8).

The best solution determined for a tree with k nodes may be defined as the least upper bound

$$Z_u^k = \min_{l=1,\ldots,k} \bar{Z}^l$$

In the (k+1)-st step the best upper bound may be updated by the rule

$$Z_u^{k+1} = \begin{cases} Z_u^k, & \text{if } \bar{Z}^{k+1} \ge Z_u^k \\ \bar{Z}^{k+1}, & \text{if } \bar{Z}^{k+1} < Z_u^k \end{cases}$$

To check for the existence of dominated nodes at step k+1 (the dominance test) compare the
cost, $Z^{k+1}$, associated with the node $Y^{k+1}$ (feasible or infeasible), with $Z_u^{k+1}$. If $Z^{k+1} \ge Z_u^{k+1}$ the set
of nodes dominated by $Y^{k+1}$ may be removed from the tree as they will exhibit costs not less than
those of the best solution found thus far. As more nodes are examined, the tree is reduced in size
until a point is reached when all nodes in the reduced tree have been examined. If this point is
reached at step q, then $Z_u^q$ is the optimal cost. This is true since $Z_u^q$ represents the least upper
bound on all nodes explicitly examined, and is a better solution than those nodes implicitly examined
through placement in the dominated set. The algorithm will terminate in a finite number of
steps since the number of nodes in the initial tree is $(n!)^r$, and unexamined nodes in the tree are
evaluated at each step with at least one being removed. This may be shown in set theoretic notation
as follows:

Let $T^k$ = the set of nodes in the tree at step k (reduced tree).

$E^k$ = set of nodes in $T^k$ examined in previous steps.

$\bar{E}^k$ = set of nodes not yet examined in $T^k$.

$D^k$ = set of nodes dominated by nodes in $E^k$.

Thus, at step k of the algorithm the set of nodes P may be partitioned into three nonintersecting
sets, i.e.,

$$P = E^k + \bar{E}^k + D^k.$$

On the (k+1)-st iteration a node $Y^{k+1}$ is selected from $\bar{E}^k$ and placed in $E^{k+1}$. If $Y^{k+1}$ determines
a set of dominated nodes, those dominated nodes not in $E^k$, say $D(Y^{k+1})$, will also be deleted from
$\bar{E}^k$ and added to $D^k$.



Thus,

$$|E^{k+1}| = |E^k| + 1$$

$$|\bar{E}^{k+1}| = |\bar{E}^k| - 1 - |D(Y^{k+1})|$$

$$|D^{k+1}| = |D^k| + |D(Y^{k+1})|$$

and $|\bar{E}^{k+1}| < |\bar{E}^k|$. Since $|\bar{E}^1| = |P|$ and P is finite, termination occurs at some point k = q where $\bar{E}^q = \emptyset$.
A sketch of the algorithm follows.

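THE ALGORITHM

Initialization:

Let k = 1, $\bar{E}^k = P$, $E^k = D^k = \emptyset$, $Z_u^k = \infty$.

Step 1: Select a node $Y^k$ from $\bar{E}^k$. If $\bar{E}^k = \emptyset$,

STOP: OPTIMAL SOLUTION IS $Z_u^k$.

Step 2: Compute cost of $Y^k$

$$Z^k = \sum_{s=1}^{r} Z_{st_s}$$

Step 3: Feasibility check

$$\gamma^k = \begin{cases} 1, & \text{if } Y^k \text{ is feasible} \\ 0, & \text{if } Y^k \text{ is infeasible} \end{cases}$$

Step 4: Upper bound associated with $Y^k$

$$\bar{Z}^k = \begin{cases} Z^k, & \text{if } \gamma^k = 1 \\ \min_{s=1,\ldots,r} Z^{ks}, & \text{if } \gamma^k = 0 \end{cases}$$

Step 5: Update best solution

$$Z_u^k \leftarrow \begin{cases} Z_u^k, & \text{if } \bar{Z}^k \ge Z_u^k \\ \bar{Z}^k, & \text{if } \bar{Z}^k < Z_u^k \end{cases}$$

Step 6: Determine dominated nodes at $Y^k$

$$D(Y^k) = \begin{cases} P_{t_1, \ldots, t_s, \ldots, t_r}, & \text{if } \gamma^k = 1 \text{ or } Z^k \ge Z_u^k \\ \emptyset, & \text{otherwise} \end{cases}$$

where $P_{t_1, \ldots, t_s, \ldots, t_r}$ is the set of nodes dominated by the node $Y^k = \{Y_{1t_1}, \ldots, Y_{st_s}, \ldots, Y_{rt_r}\}$.

Step 7: Update solution sets

$$E^{k+1} = E^k + Y^k$$

$$\bar{E}^{k+1} = \bar{E}^k - Y^k - D(Y^k)$$

$$D^{k+1} = D^k + D(Y^k)$$

Return to Step 1.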
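
The search over rank tuples can be sketched compactly in Python. The sketch below is a simplified variant and not the authors' program: it ranks assignments by brute force instead of a ranking procedure, and it selects nodes in nondecreasing cost order with a heap rather than by the stage-by-stage selection scheme described in this paper. Under that ordering the first feasible node examined is optimal, because every unexamined node costs at least as much. The function names and the 2-job cost matrices are illustrative assumptions.

```python
import heapq
from itertools import permutations


def ranked(cost):
    """All assignments of one stage as (cost, permutation), cheapest first."""
    n = len(cost)
    return sorted((sum(cost[i][p[i]] for i in range(n)), p)
                  for p in permutations(range(n)))


def is_closed(perms):
    """True if every job k returns to node k after all stages."""
    for k in range(len(perms[0])):
        node = k
        for p in perms:
            node = p[node]
        if node != k:
            return False
    return True


def solve(costs):
    """Best-first search over rank tuples; returns (optimal cost, node)."""
    lists = [ranked(c) for c in costs]
    r = len(lists)
    start = (0,) * r                                  # all first ranked assignments
    heap = [(sum(l[0][0] for l in lists), start)]
    seen = {start}
    while heap:
        z, node = heapq.heappop(heap)                 # cheapest unexamined node
        if is_closed([lists[s][t][1] for s, t in enumerate(node)]):
            return z, node                            # first feasible node is optimal
        for s in range(r):                            # successors: raise one rank by 1
            if node[s] + 1 < len(lists[s]):
                succ = node[:s] + (node[s] + 1,) + node[s + 1:]
                if succ not in seen:
                    seen.add(succ)
                    z_succ = z - lists[s][node[s]][0] + lists[s][node[s] + 1][0]
                    heapq.heappush(heap, (z_succ, succ))
    return None


# Made-up 2-job, 3-stage instance (not from the paper).
toy = [[[1, 4], [3, 1]], [[5, 1], [2, 6]], [[2, 3], [4, 2]]]
print(solve(toy))    # (12, (0, 0, 1))
```

The heap keeps more nodes in memory than the column-at-a-time scheme of the paper, so this variant trades storage for simplicity.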

The selection of a node from $\bar{E}^k$ deserves further explanation. In the actual programming of
the algorithm (see [9]) the set operations are performed implicitly, and hence no bookkeeping
requirements are necessary to record the set element transfer operations in Step 7. Moreover, the
elements in $\bar{E}^k$ are generated as selected. This selection decision is designed to fulfill two objectives:
(i) reduce storage in the assignment ranking routine, and (ii) accelerate the generation of dominated
nodes through the exploitation of all computed assignments thus far.

The selection scheme may be described with the aid of the list of ranked assignments shown 
in Table 1. 

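Table 1. List of Ranked Assignments

Ranked                      Subproblem
assignment    A_1          . . .    A_s          . . .    A_r

1             Y_{1,1}      . . .    Y_{s,1}      . . .    Y_{r,1}
2             Y_{1,2}      . . .    Y_{s,2}      . . .    Y_{r,2}
.             .                     .                     .
.             Y_{1,t_1}    . . .    Y_{s,t_s}    . . .    Y_{r,t_r}
.             .                     .                     .
n!            Y_{1,n!}     . . .    Y_{s,n!}     . . .    Y_{r,n!}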



A node $Y^k$ is an ordered set of elements selected from Table 1 in such a manner that one element
is selected from each column in the list and ordered column wise. Hence, the list represents the
basic data for generating all $(n!)^r$ solutions in P. Any column in the list may be generated using
Murty's assignment ranking technique. This technique requires a considerable amount of stored
data from the t-1 first assignments to determine the t-th ranked assignment. Thus, it is efficient
to exhaust the generation of assignments one column at a time.

One proceeds from the origin node (1, . . ., 1, . . ., 1) and generates successive nodes by ranking
the assignments of the first stage assignment problem until it is determined that a node using the 
next assignment in the first column of the list is feasible or provides inferior solutions by the law 
of dominance. (See Step 6.) This is illustrated in Figure 5 as path 1 for a three stage example. 
One then backtracks and starts to rank the assignments of the second stage. The third stage assign- 
ments are not started until all undominated nodes involving combinations of already ranked 
solutions in stage 1 are examined. These combinations for the example illustrated in Figure 4 are 
shown in paths 2 and 3. One continues in this fashion until all stages have been ranked. The details 
of exhausting all combinations of previously generated assignments may be found in reference 
[9] which is available from the authors upon request. The efficiency of this scheme is demonstrated 
in the following three stage example where only 9 out of a possible 216 nodes are explicitly examined. 
Moreover, the scheme may be terminated at any step to obtain an approximate feasible solution 
and a bound on the cost error of this solution through comparison with the lower bound found in 
the first step. 



Figure 5. A tree to illustrate the sequence of paths followed during the branching scheme. [Tree diagram omitted.]

EXAMPLE 

The cost matrices for this three job-three machine stage example and the associated list of 
ranked assignments are shown in Tables 2 and 3, respectively. 



Table 2. Cost Matrices for Three Stage Example

Stage 1         Stage 2         Stage 3

 4   6   2       3   5   9       4   8   0
 1   8  10      10   0   1       2   0   2
 5   2   5       2   8   5       7   0   0



Table 3. Ranked Assignments for Three Stage Example

        Stage 1                  Stage 2                  Stage 3
Rank  Assignment  Cost     Rank  Assignment  Cost     Rank  Assignment  Cost
1     312          5       1     231          8       1     312          2
2     213         12       2     123          8       2     123          4
3     321         15       3     321         11       3     132          6
4     132         16       4     132         12       4     321          7
5     123         17       5     213         20       5     213         10
6     231         21       6     312         27       6     231         17



The assignments in Table 3 are represented as a permutation of the sequence (1, 2, 3). The 
computational and branching results for each step of the algorithm are shown in Table 4 
and Figure 6, respectively. 
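
The ranked lists and the optimal cost of this example can be cross-checked by brute force. The sketch below makes two assumptions that should be read as such: the cost matrices are a reading of Table 2 in which blank entries are taken as zero and in which the assignment of each matrix to a stage is chosen so that its ranked costs agree with Table 3, and an assignment written "312" is read as sending node 1 to node 3, node 2 to node 1 and node 3 to node 2. All names are illustrative.

```python
from itertools import permutations, product

# Assumed cost matrices (rows: from-node, columns: to-node).
C1 = [[4, 6, 2], [1, 8, 10], [5, 2, 5]]
C2 = [[3, 5, 9], [10, 0, 1], [2, 8, 5]]
C3 = [[4, 8, 0], [2, 0, 2], [7, 0, 0]]


def ranked(cost):
    """Assignments of one stage as (cost, label, permutation), cheapest first."""
    n = len(cost)
    rows = []
    for p in permutations(range(n)):
        z = sum(cost[i][p[i]] for i in range(n))
        label = "".join(str(j + 1) for j in p)    # 1-based notation as in Table 3
        rows.append((z, label, p))
    return sorted(rows)


def is_closed(perms):
    """True if every job k returns to node k after all stages."""
    for k in range(len(perms[0])):
        node = k
        for p in perms:
            node = p[node]
        if node != k:
            return False
    return True


tables = [ranked(c) for c in (C1, C2, C3)]
for s, table in enumerate(tables, start=1):
    print("Stage", s, [(label, z) for z, label, _ in table])

# Exhaustive search over all 216 combinations of stage assignments.
best = min(sum(z for z, _, _ in combo)
           for combo in product(*tables)
           if is_closed([p for _, _, p in combo]))
print(best)    # 17
```

Under these assumptions the ranked costs reproduce the cost columns of Table 3, and the exhaustive optimum of 17 agrees with the optimal cost reported at step 9 of Table 4. The two cost-8 assignments of stage 2 are tied, so their order may differ from the order printed in Table 3.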




Table 4. Nodal Information at Each Step of the Algorithm for Three Stage Example

Step   Node        Cost   Upper bound   Min upper bound   Feasibility   Set of dominated nodes
k      Y^k         Z^k    \bar{Z}^k     Z_u^k             \gamma^k      D(Y^k)

1      (1, 1, 1)   15     23            23
2      (2, 1, 1)   22     27            23
3      (3, 1, 1)   25     25            23                              P_{3,1,1}
4      (1, 2, 1)   15     31            23
5      (2, 2, 1)   24     25            23                              P_{2,2,1}
6      (1, 3, 1)   18     22            22
7      (1, 4, 1)   19     23            22
8      (1, 5, 1)   27     22            22                              P_{1,5,1}
9      (1, 1, 2)   17     17            17                1             P_{1,1,2}




Figure 6. Tree illustration of three stage example. [Tree diagram omitted; its annotations mark the feasible (optimal) node and a node whose cost exceeds the best upper bound at step 8.]




The optimal cyclical job sequences are shown in Figure 7. 




Figure 7. — Optimal cyclical job sequences for three stage example. 

It is noted that of the 6 ranked assignments for each stage only the first 3, 5 and 2 ranked
assignments for stages 1, 2 and 3, respectively, had to be computed in the example.

A computer program has been written in Fortran IV for the IBM 360/50 to apply the algorithm 
for r equal three [9]. Eight sample problems were solved using this code with execution times 
shown in Table 5. The reader should be cautioned that these times are expected to increase as 
additional stages are added beyond three, although no computational evidence has been accumu- 
lated as yet. 

Table 5. Computer Results for Sample Problems (r=3)

Problem number    Problem size (n)    Execution time (seconds)
1                 3                    8.5
2                 4                    8.8
3                 4                    9.5
4                 5                    8.7
5                 5                   14.5
6                 7                   12.3
7                 7                   21.2
8                 10                  36.1



ACKNOWLEDGMENTS 

The authors wish to give their appreciation to Dr. Richard Francis of Ohio State University 
for securing permission to use the assignment ranking computer program referenced in [10]. 



REFERENCES 

[1] Dantzig, G. B. and J. H. Ramser, "The Truck Dispatching Problem," Management Science,
6, 80-91 (1959).

[2] Clarke, G. and J. W. Wright, "Scheduling of Vehicles from a Central Depot to a Number of
Delivery Points," Operations Research, 12, 568-81 (1964).

[3] Newton, R. M. and W. H. Thomas, "Design of School Bus Routes by Computer," Socio-
Economic Planning Sciences, 3, 75-85 (1969).

[4] Krolak, P., W. Felts and J. Nelson, "A Man-Machine Approach Toward Solving the Generalized
Truck-Dispatching Problem," Transportation Science, 6, 149-170 (1972).

[5] Bellmore, M., J. C. Liebman and D. H. Marks, "An Extension of the (Szwarc) Truck Assignment
Problem," Naval Research Logistics Quarterly, 19, 91-99 (1972).

[6] Svestka, J. A. and V. E. Huckfeldt, "Computational Experience with an M-Salesman Traveling
Salesman Algorithm," Management Science, 7, 790-799 (1973).

[7] Srivastava, S. S., S. Kumar, R. C. Garg and P. Sen, "Generalized Traveling Salesman Problem
Through n Sets of Nodes," Canadian Operational Research Society, 7, 97-101 (1969).

[8] Murty, K. G., "An Algorithm for Ranking All the Assignments in Order of Increasing Costs,"
Operations Research, 16, 682-687 (1968).

[9] Stern, H. I., P. Rodriguez and M. L. Utter, "The M-Traveling Salesman Problem with
Ordered Visits," Operations Research and Statistics Paper No. 37-71-P5, Rensselaer
Polytechnic Institute, Troy, New York (1971).

[10] Fluharty, E., "Solving the Quadratic Assignment Problem by Ranking Assignments," Masters
Thesis, Ohio State University (1969).



JOHNSON'S APPROXIMATE METHOD FOR THE 3 × n
JOB SHOP PROBLEM



Wlodzimierz Szwarc and George K. Hutchinson 

School of Business Administration 

University of Wisconsin- Milwaukee 

Milwaukee, Wisconsin 



ABSTRACT 

The effectiveness of Johnson's Approximate Method (JAM) for the 3 X n job
shop scheduling problem was examined on 1,500 test cases with n ranging from 6
to 50 and with the processing times A_i, B_i, C_i (for item i on machines A, B, C) being
uniformly and normally distributed. JAM proved to be quite effective for the case
B_i < max (A_i, C_i) and optimal for B_i < min (A_i, C_i).



1. INTRODUCTION 

The 3 X n job shop problem can be defined as follows. Machines A, B, C process each of the
n items 1, 2, . . ., n in the order A, B, C. Given the processing times (A_i, B_i, C_i for item i on machines
A, B, C respectively), the problem is to find a sequence minimizing the timespan. S. M. Johnson
[2] provided a very quick method for solving the 2 X n problem which, when adapted to the 3
X n case, reads as follows.

Solve instead a 2 X n problem assuming the processing times for item i on the first and second
machines to be A_i + B_i and B_i + C_i respectively. The resulting sequence p is not necessarily optimal,
however.

R. J. Giglio and H. M. Wagner [1] tested Johnson's Approximate Method (JAM) on twenty
3 X 6 trials (with processing times uniformly distributed between 1 and 30). In 9 out of 20 trials the
optimal sequence was produced. To check whether p is optimal they had to enumerate all n!
permutations (in general there is no other way) and find the respective elapsed time.

Let T(p) and t(p) be the timespans for sequence p in the 3 X n and 2 X n (A+B, B+C)
problems respectively. W. Szwarc established in [3] sufficient optimality conditions for JAM. Sequence
p is optimal for the 3 X n problem if

(1)    T(p) = t(p) - \sum_{i=1}^{n} B_i

REMARK: The right hand side of (1) is actually the lower bound of the minimal timespan of
the three machine problem.
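
The procedure and test (1) can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' ECSL program: the function names and the four-item instance are hypothetical, and Johnson's two-machine rule is written in its usual textbook form (items whose first-machine time does not exceed their second-machine time come first, in increasing order of the former; the rest follow in decreasing order of the latter).

```python
from itertools import permutations

def johnson_order(a, b):
    """Johnson's rule for two machines: return an optimal item order (0-based)."""
    items = range(len(a))
    first = sorted((i for i in items if a[i] <= b[i]), key=lambda i: a[i])
    last = sorted((i for i in items if a[i] > b[i]), key=lambda i: b[i], reverse=True)
    return first + last

def timespan(seq, *machines):
    """Timespan of a permutation schedule on any number of machines in series."""
    done = [0] * len(machines)      # completion time of the latest item on each machine
    for i in seq:
        prev = 0                    # completion of item i on the previous machine
        for k, times in enumerate(machines):
            done[k] = max(done[k], prev) + times[i]
            prev = done[k]
    return done[-1]

def jam(A, B, C):
    """Return (sequence p, T(p), right hand side of (1))."""
    ab = [x + y for x, y in zip(A, B)]
    bc = [y + z for y, z in zip(B, C)]
    p = johnson_order(ab, bc)
    return p, timespan(p, A, B, C), timespan(p, ab, bc) - sum(B)

def optimum_by_enumeration(A, B, C):
    """Brute force over all n! permutations; only practical for small n."""
    return min(timespan(q, A, B, C) for q in permutations(range(len(A))))

# Hypothetical instance with B_i < min(A_i, C_i) for every item.
A = [5, 8, 6, 9]
B = [2, 3, 1, 4]
C = [7, 6, 8, 5]
p, T, bound = jam(A, B, C)
print(p, T, bound)                      # [0, 2, 1, 3] 37 37
print(optimum_by_enumeration(A, B, C))  # 37
```

Because T(p) equals the bound here, condition (1) certifies optimality without the enumeration; the brute-force call is included only as a cross-check on a small instance.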


The efficiency of Johnson's two machine method makes it possible to test JAM (via (1)) for
practically any number of items. We examined 1,500 3 X n trials with n ranging from 6 to 50.
The processing times were drawn from a uniform and a normal distribution. The results
show that JAM is quite effective if B_i < max (A_i, C_i) for all i (see Tables 1 and 2). JAM always
produced an optimal solution for the case B_i < min (A_i, C_i), which led us to believe that a theorem
confirming optimality of JAM (for this case) was true. This was subsequently proven in [4]. We feel
that JAM is actually much more effective than the results indicate. This is because (1) is a
sufficient but not a necessary condition: a sequence that fails (1) may still be optimal. Moreover
(see Section 3) its effectiveness seems not to depend on the number of items.

2. COMPUTATIONAL RESULTS 

The program was written in ECSL and run on the UNIVAC 1110. Thirty separate runs were
performed, each providing three separate streams of random numbers A_i, B_i, C_i for the 50
3 X n trials as well as calculating T(p) and its lower bound.

The results are summarized by Table 1. 



Table 1

Distribution of      Restriction on B_i        Number of items n                     Row
A_i, B_i, C_i                                  6        10       20       50

Normal, m=50, σ=5    B_i < max (A_i, C_i)      43, 48   46, 47   47, 43   -          1

Uniform, 1-99        B_i < max (A_i, C_i)      42, 32   46, 47   43, 44   49, 44     2
                     None                      18, 14   21, 16   24, 25   21, 24     3
                     B_i < min (A_i, C_i)      50, 50   50, 50   50, 50   50, 50     4



REMARKS: 1. Each cell registers the results of two independent 50 trial runs.

2. The numbers in the cells indicate how many times per run (i.e., out of 50) T(p) was equal
to its lower bound.

3. The three streams generate numbers in A_i, B_i, C_i order for each i = 1, . . . , n.
However, in the case when the B_i are restricted, the values of B_i are interchanged whenever necessary
with A_i or (and) C_i to comply with the appropriate restriction.

4. The run times in CPU seconds per run (50 trials) for n=6, 10, 20, and 50 are 25.69, 47.10,
129.13, and 588.93 respectively.
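
The interchange described in Remark 3 might be sketched as follows. This is an assumption about one reasonable way to do the swapping, not a transcription of the original program: the function name, the string codes for the restrictions, and the treatment of ties (equality is allowed here) are all hypothetical.

```python
import random

def draw_trial(n, restriction=None, rng=random):
    """Draw A_i, B_i, C_i uniformly on 1..99, then enforce a restriction on B_i.

    restriction: None, "max" (B_i <= max(A_i, C_i)) or "min" (B_i <= min(A_i, C_i)).
    Values are only interchanged within an item, never redrawn.
    """
    A, B, C = [], [], []
    for _ in range(n):
        a = rng.randint(1, 99)
        b = rng.randint(1, 99)
        c = rng.randint(1, 99)
        if restriction == "max" and b > max(a, c):
            # B is the largest of the three: swap it with the larger neighbour.
            if a >= c:
                a, b = b, a
            else:
                c, b = b, c
        elif restriction == "min":
            # Up to two swaps leave B as the smallest of the three.
            if b > a:
                a, b = b, a
            if b > c:
                c, b = b, c
        A.append(a)
        B.append(b)
        C.append(c)
    return A, B, C

A, B, C = draw_trial(6, "min", random.Random(1977))
print(all(b <= min(a, c) for a, b, c in zip(A, B, C)))  # True
```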






3. STATISTICAL VERIFICATION OF THE RESULTS 

The first analysis was to examine whether the random numbers generated in each run affect
the results within each cell. According to the standard chi-square test based on the data in the
first two rows (excluding n=50) the two results in a cell are not statistically significantly different
at the 20% level. Moreover, there are no statistically significant differences (at the 20% level)
between 1) the distributions, 2) the number of items n, 3) the interactions of n and the distributions.

Applying the same technique to the data from the last three rows, we found that 1) there are 
no statistically significant differences between the interactions of n and the restrictions on Bi 
(level 20%) ; 2) as expected there is a strong statistically significant difference between the various 
restrictions on Bi (level 0.1%). 

The conclusion that the effectiveness of JAM does not depend on n seems rather surprising. 
Hence we decided to verify it on another test. 

The ratio of the actual T(p) to its lower bound (in percent) was found for each of the 800
trials from rows 2 and 3. The entries of Tables 2 and 3 register the number of trials belonging to a
specific ratio and number of items category. The numbers on the left hand side in Tables 2 and 3
symbolize unit intervals. For instance 104 means the half closed interval [104, 105).
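
As a small illustration of this classification, the following Python sketch assigns a trial to its unit interval and tallies a list of trials. The function names and the sample values are hypothetical; integer processing times are assumed so that the category can be computed without rounding error.

```python
from collections import Counter

def ratio_category(T, lower_bound):
    """Lower end k of the unit interval [k, k+1) containing 100 * T / lower_bound."""
    return (100 * T) // lower_bound

def tabulate(trials, top=128):
    """Count trials per ratio category; everything above `top` is pooled."""
    counts = Counter()
    for T, lb in trials:
        k = ratio_category(T, lb)
        counts[k if k <= top else f"over {top}"] += 1
    return counts

print(ratio_category(523, 500))   # 104, i.e. the interval [104, 105)
print(tabulate([(500, 500), (523, 500), (700, 500)]))
```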



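Table 2. Case B_i < max (A_i, C_i)

Ratio (actual time / lower bound), in percent, by number of items

Ratio        6     10     20     50    Total
100         83     97     97    100      377
101          7      1      3      -       11
102          3      1      -      -        4
103          1      1      -      -        2
104          3      -      -      -        3
105          -      -      -      -        -
106          1      -      -      -        1
107          -      -      -      -        -
108          1      -      -      -        1
109          -      -      -      -        -
110          -      -      -      -        -
111          -      -      -      -        -
112          -      -      -      -        -
113          -      -      -      -        -
114          -      -      -      -        -
115          1      -      -      -        1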



Table 3. No restriction on B_i

Ratio (actual time / lower bound), in percent, by number of items

Ratio        6     10     20     50    Total
100         35     41     52     52      180
101          5      3      3      4       15
102          4      5      4      1       14
103          7      3      4      1       15
104          5      9      1      6       21
105          3      2      5      4       14
106          4      6      2      4       16
107          2      2      3      2        9
108          1      3      1      2        7
109          3      -      1      4        8
110          2      4      1      3       10
111          2      3      -      2        7
112          -      -      2      2        4
113          2      1      1      1        5
114          3      2      1      1        7
115          2      4      3      4       13
116          1      -      -      -        1
117          2      2      3      2        9
118          -      -      1      1        2
119          -      1      1      -        2
120          3      -      1      1        5
121          1      1      3      -        5
122          -      -      1      -        1
123          3      1      1      -        5
124          1      -      -      1        2
125          -      -      2      -        2
126          2      -      1      -        3
127          -      -      -      -        -
128          -      -      -      -        -
over 128     7      7      2      2       18



The distributions were compared pairwise using the Kolmogorov-Smirnov Two Sample Test. 
The null hypothesis that there were no differences could not be rejected at a 10% level of 
significance. 

The results from Table 2 show that almost all trials are within a one percent range from the
lower bound (row 1) for n > 10. Note also that the entries in the first rows of Tables 2 and 3 happen
to increase as n increases.




REFERENCES 

[1] Giglio, R. J. and H. M. Wagner, "Approximate Solutions to the Three Machine Scheduling 
Problem," Operations Research, 12, 305-324 (1964). 

[2] Johnson, S. M., "Optimal Two and Three-Stage Production Schedules with Setup Times 
Included," Naval Research Logistics Quarterly, 1, 61-68 (1954). 

[3] Szwarc, W., "Mathematical Aspects of the 3 x n Job-Shop Sequencing Problem," Naval Re- 
search Logistics Quarterly, 21, 145-153 (1974). 

[4] Szwarc, W., "Special Cases of the Flow-Shop Problem," School of Business Administration, 
University of Wisconsin-Milwaukee, Dec. 1975; to appear. Naval Research Logistics 
Quarterly, 24, No. 3, Sept. 1977. 



A CONVEX PROPERTY OF AN ORDERED FLOW SHOP 
SEQUENCING PROBLEM* 



S. S. Panwalkar 

Department of Industrial Engineering 

Texas Tech University 

Lubbock, Texas 

A. W. Khan 

American Science and Engineering Company 
Houston, Texas 



ABSTRACT 

A flow shop sequencing problem with ordered processing time matrices is con- 
sidered. A convex property for the makespan sequences of such problems is dis- 
cussed. On the basis of this property an efficient optimizing algorithm is presented. 
Although the proof of optimality has not been developed, several hundred problems
were solved optimally with this procedure. 



1. INTRODUCTION 

Consider the classical n job m machine flow shop sequencing problem. Recently, Smith, et al., 
[1] have introduced a subcategory of the flow shop problem with "ordered processing times." 
Two properties are stipulated for such an "ordered flow shop problem." 

(i) If a particular job 'a' has a smaller value of processing time on any machine compared to 
another job 'b', then job 'a' will have smaller (or equal) processing time compared to job 'b' on all 
machines, and (ii) if any job has its j-th smallest processing time on any machine, then every job will
have its j-th smallest processing time on the same machine.

Note that due to the characteristics described above, we can conveniently number jobs in the 
ascending order of processing times. In the following we will consider the ordered flow shop problem 
with minimum makespan criterion and permutation schedules only. 
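
The two properties can be checked mechanically. The Python sketch below is one hedged reading of them: property (i) is taken to mean that any two jobs are comparable machine by machine, and property (ii) that any two machines are comparable job by job (ties allowed). The function names and the 3 X 3 matrices are hypothetical.

```python
from itertools import combinations

def _comparable(u, v):
    """True if u <= v componentwise or u >= v componentwise."""
    return all(x <= y for x, y in zip(u, v)) or all(x >= y for x, y in zip(u, v))

def is_ordered(p):
    """p[j][k] is the processing time of job j on machine k."""
    jobs_ok = all(_comparable(u, v) for u, v in combinations(p, 2))
    machines_ok = all(_comparable(u, v) for u, v in combinations(list(zip(*p)), 2))
    return jobs_ok and machines_ok

ordered = [[2, 5, 3],
           [4, 9, 6],
           [7, 12, 8]]
not_ordered = [[2, 5, 3],
               [4, 1, 6],     # job 2 is faster than job 1 on machine 2 only
               [7, 12, 8]]
print(is_ordered(ordered), is_ordered(not_ordered))  # True False
```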

2. NOTATION 

Let N denote the set of n jobs and let the jobs be numbered in ascending order of their
processing times. Job n will then be referred to as the "largest job."



*Partial support for this research was provided by the National Science Foundation Grant GK-2869.


Let the set N be divided into two partitions, σ and σ', such that σ contains at least one job
(σ' may be empty if σ = N). Assume for convenience that the largest job is always included in σ.
Let r denote the number of jobs in σ (1 ≤ r ≤ n).

3. PREVIOUS RESULTS 

Smith et al., [1, 2] have presented some interesting results which can be represented by the 
following two theorems. The statements of the theorems are modified slightly to match the above 
notation and the proofs of these theorems may be found in [1, 2]. 

THEOREM 1 [1]: In the ordered flow shop problem a minimum makespan sequence is given by
arranging jobs in descending (ascending) order of processing times if the largest processing time for
every job occurs on the first (last) machine.

THEOREM 2 [2]: In the ordered flow shop problem there exists a minimum makespan
sequence of the form σσ', where jobs in σ are arranged in ascending order of processing times followed
by jobs in σ' in descending order of processing times.

It may be noted that the second theorem is a more general one. Thus there are 2^(n-1) sequences
(for all values of r) among which an optimum sequence can be found. Each sequence satisfying Theorem
2 is said to have a "pyramid structure" and for a sequence with pyramid structure, the value of r
indicates the position of the largest job. Smith et al., [2] have presented an enumeration algorithm
(S-P-D method) to evaluate the 2^(n-1) sequences and recommend the use of a branch and bound procedure
for further improvement.
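
To make the pyramid structure concrete, the following Python sketch enumerates such sequences for jobs numbered 1 to n in ascending order of processing time. It only illustrates the counting argument and is not the S-P-D method itself; the function name is hypothetical.

```python
from itertools import combinations

def pyramid_sequences(n, r=None):
    """Yield pyramid sequences of jobs 1..n.

    Jobs ahead of job n appear in ascending order, jobs after it in descending
    order. If r is given, only sequences with job n in position r are produced.
    """
    positions = range(1, n + 1) if r is None else [r]
    for pos in positions:
        for before in combinations(range(1, n), pos - 1):
            after = [j for j in range(n - 1, 0, -1) if j not in before]
            yield list(before) + [n] + after

print(len(list(pyramid_sequences(6))))          # 32, i.e. 2^(6-1)
print(["".join(map(str, s)) for s in pyramid_sequences(6, r=2)])
# ['165432', '265431', '365421', '465321', '564321']
```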

4. CONVEX NATURE OF THE ORDERED PROBLEM 

In order to explore some additional properties of the ordered problems (and possibly to develop
a more efficient solution procedure), several problems were solved by the S-P-D method. For
each problem, the makespan values of all 2^(n-1) sequences were analyzed.

For a given value of r defined above, there are C(n-1, r-1) possible sequences with a pyramid
structure. Let S_r represent the set of these sequences. Let S*_r represent the best sequence and T*_r the
corresponding value of the makespan. If an optimal sequence for the complete problem (satisfying
Theorem 2) occurs corresponding to r=k, it was observed that for all enumerated problems

T*_1 > T*_2 > . . . > T*_(k-1) > T*_k < T*_(k+1) < . . . < T*_n

This characteristic will be explained with an example. Consider a 6 job 6 machine problem in Table
I. For this problem the S-P-D algorithm will generate 32 sequences.

Table I. Processing Time Matrix for an Ordered Flow Shop Problem

                     Machine
Job       1      2      3      4      5      6
 1        3     13      7      6      8      9
 2       19     85     42     37     50     55
 3       26    116     57     51     68     75
 4       29    130     64     56     76     84
 5       36    161     80     70     95    104
 6       68    304    150    132    179    197






These sequences can be divided into the 6 sets as follows:

S_1 = (654321)
S_2 = (165432, 265431, 365421, 465321, 564321)
S_3 = (126543, 136542, 146532, 156432, 236541, 246531, 256431, 346521, 356421, 456321)
S_4 = (123654, 124653, 125643, 134652, 135642, 145632, 234651, 235641, 245631, 345621)
S_5 = (123465, 123564, 124563, 134562, 234561)
S_6 = (123456)

The sequence with minimum makespan in each of the above sets and the corresponding value of
makespan are given below.

T*_1 = 1357
T*_2 = 1338
T*_3 = 1332
T*_4 = 1373
T*_5 = 1419
T*_6 = 1476

The makespan for the best sequence in each set is plotted in Figure 1 against the position of the
largest job n in that sequence. With each change in position of the largest job, starting from either end,
the value of makespan decreases until the optimal value is reached and then it increases.

It may be noted that although the figure represents discrete quantities, the lines drawn through 
various points represent a "convex" shape. A mathematical proof for such a property has not been 
developed. All enumerated problems, however, exhibited this property. 
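
The values quoted for the 6 job 6 machine example can be reproduced by direct enumeration. The Python sketch below is an illustration written for this purpose, not the authors' program: it uses the standard permutation flow shop recurrence for the makespan, and the names `TABLE_I`, `makespan` and `best_by_position` are hypothetical.

```python
from itertools import combinations

# Processing times from Table I: TABLE_I[j - 1][k] is job j on machine k + 1.
TABLE_I = [
    [3, 13, 7, 6, 8, 9],
    [19, 85, 42, 37, 50, 55],
    [26, 116, 57, 51, 68, 75],
    [29, 130, 64, 56, 76, 84],
    [36, 161, 80, 70, 95, 104],
    [68, 304, 150, 132, 179, 197],
]

def makespan(seq, p):
    """Makespan of a permutation schedule; seq holds 1-based job numbers."""
    done = [0] * len(p[0])          # completion time of the latest job per machine
    for j in seq:
        prev = 0                    # completion of job j on the previous machine
        for k, t in enumerate(p[j - 1]):
            done[k] = max(done[k], prev) + t
            prev = done[k]
    return done[-1]

def best_by_position(p):
    """Return {r: (T*_r, S*_r)} by evaluating every pyramid sequence."""
    n = len(p)
    best = {}
    for r in range(1, n + 1):
        for before in combinations(range(1, n), r - 1):
            after = [j for j in range(n - 1, 0, -1) if j not in before]
            seq = list(before) + [n] + after
            T = makespan(seq, p)
            if r not in best or T < best[r][0]:
                best[r] = (T, seq)
    return best

for r, (T, seq) in best_by_position(TABLE_I).items():
    print(r, T, "".join(map(str, seq)))
# 1 1357 654321
# 2 1338 265431
# 3 1332 126543
# 4 1373 123654
# 5 1419 123465
# 6 1476 123456
```

The printed values fall from r=1 to r=3 and rise afterwards, which is the pattern described in the text.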

A simple algorithm based on this property is presented in the appendix. Two hundred
problems ranging in size from 6X15 to 14X15 were solved by the proposed method as well as by a branch
and bound method using Theorem 2 properties. All problems solved by the proposed method gave optimal
results (as confirmed by the branch and bound solutions). The proposed method proved to be more
efficient in all cases. Finally, we would like to encourage readers interested in the problem to develop
mathematical proofs.



S*_1 = 654321
S*_2 = 265431
S*_3 = 126543
S*_4 = 123654
S*_5 = 123465
S*_6 = 123456

[Plot: minimum makespan (vertical axis, ticks at 1300, 1350, 1400, 1450) against position of largest job (horizontal axis).]

Figure 1. Variation of minimum makespan with the position of largest job n.




APPENDIX 

The step by step procedure for the proposed algorithm is as follows:

STEP 1: Renumber the jobs according to the ascending order of magnitudes of the processing
times.

STEP 2: Set i = n+1 and set T*_(n+1) = ∞.

STEP 3: Set i = i-1. If i = 0, go to Step 6.

STEP 4: With the largest job in the i-th position evaluate all sequences with pyramid structure
for makespan (i.e., all sequences with jobs in ascending and then descending order of processing
times). When each new sequence is generated, compare with the previous best sequence and store
only the best sequence. When all the pyramid shaped sequences with the largest job in the i-th position are
evaluated, denote the best sequence by S*_i and its makespan by T*_i.

STEP 5: If T*_i < T*_(i+1) go to Step 3.

STEP 6: The sequence S*_(i+1) is an optimal sequence.
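
The steps above might be coded as follows. This Python sketch is an illustration under stated assumptions rather than the authors' implementation: jobs are assumed to be already numbered as in Step 1, the makespan is computed with the standard permutation flow shop recurrence, and all function names are hypothetical. Like the procedure itself, it relies on the convex property, for which no proof is given.

```python
from itertools import combinations

def makespan(seq, p):
    """Makespan of a permutation schedule; seq holds 1-based job numbers."""
    done = [0] * len(p[0])
    for j in seq:
        prev = 0
        for k, t in enumerate(p[j - 1]):
            done[k] = max(done[k], prev) + t
            prev = done[k]
    return done[-1]

def best_for_position(p, i):
    """Step 4: best pyramid sequence with the largest job n in position i."""
    n = len(p)
    best_seq, best_T = None, float("inf")
    for before in combinations(range(1, n), i - 1):
        after = [j for j in range(n - 1, 0, -1) if j not in before]
        seq = list(before) + [n] + after
        T = makespan(seq, p)
        if T < best_T:                       # keep only the best sequence so far
            best_seq, best_T = seq, T
    return best_seq, best_T

def convex_search(p):
    """Steps 2 to 6: move the largest job forward while T*_i keeps decreasing."""
    n = len(p)
    best_seq, best_T = None, float("inf")    # Step 2: T*_(n+1) = infinity
    for i in range(n, 0, -1):                # Step 3: i = n, n-1, ..., 1
        seq, T = best_for_position(p, i)     # Step 4
        if not T < best_T:                   # Step 5 fails: stop searching
            break
        best_seq, best_T = seq, T
    return best_seq, best_T                  # Step 6: S*_(i+1) and its makespan

TABLE_I = [
    [3, 13, 7, 6, 8, 9],
    [19, 85, 42, 37, 50, 55],
    [26, 116, 57, 51, 68, 75],
    [29, 130, 64, 56, 76, 84],
    [36, 161, 80, 70, 95, 104],
    [68, 304, 150, 132, 179, 197],
]
print(convex_search(TABLE_I))   # ([1, 2, 6, 5, 4, 3], 1332)
```

On the Table I example the search stops after position 2, because T*_2 = 1338 is not smaller than T*_3 = 1332, and returns the sequence 126543.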


REFERENCES 

[1] Smith, M. L., S. S. Panwalkar, and R. A. Dudek, "Flow Shop Sequencing with ordered Proc- 
essing Times Matrices," Management Science, 21, 544-549 (1975). 

[2] Smith, M. L., S. S. Panwalkar and R. A. Dudek, "Flowshop Sequencing Problem with Ordered
Processing Time Matrices: A General Case," Naval Research Logistics Quarterly, 23,
481-486 (1976).









A MANPOWER PLANNING/CAPITAL BUDGETING MODEL (MAPCAB) 



Rolf H. Clark 

Office of Chief of Naval Operations 
Washington, D.C. 

Robert A. Comerford 

University of Rhode Island 
Kingston, Rhode Island 



ABSTRACT 

A deterministic resource allocation model is developed to optimize defense 
effectiveness subject to budget, manpower, and risk constraints. The model consists 
of two major submodels connected by a heuristic. The first is a mathematical pro- 
gram which optimizes the multiperiod weapon mix subject to the constraint set. 
The second is a manpower supply model based on a transition matrix in which 
individual transitions are functions of personnel related budgets and historical 
transition rates. The heuristic marries the submodels through an iterative process 
leading to improved solutions. An example is provided which demonstrates how 
systems are undercosted and overprocured if manpower supply is not properly 
reflected relative to manpower demand. 



I. INTRODUCTION 
Background 

In economic decisions, costs are normally assigned to an investment according to the inputs
used to produce and operate it. Similarly, weapon systems are typically costed by the resources
needed to develop, operate, and support them.* Because of the unique conditions of military
manpower development, costing the manpower inputs as a function of manpower used is not only 
inaccurate, but actually erroneous. The error results because manpower levels required to man a set 
of weapons may require large pipeline inventories of skills not fully used, but needed to ensure 
adequate levels of more senior or more specialized skills which are used. If costs of a weapon system 
are determined by the manpower utilized directly, without accounting for the pipeline inventories, 
then the tendency will be to under-cost and over-invest in such systems. This will result in inability 
to man them adequately. 



*As an example, the U.S. Navy's guidelines for economic analysis as stated in Secretary of the Navy
Instruction 7000.14 of 14 March 1973 specify that personnel costs are charged according to ". . . the cost of military
personnel services involved directly in the work performed."


Unaccounted for "pipeline" personnel costs apply not only to manpower waiting to be
developed into more useful skills, but also to manpower which has transited from a junior but
valuable skill to a more senior but overmanned skill. Retired personnel are an obvious example,
but other senior active skills may be of less real worth (in terms of opportunity costs) than a more
junior grade. The problem of improper weapon accumulation is compounded if improper
discounting techniques make the present value of future pension costs less than it should be.*
The result is to obtain systems more manpower intensive than is efficient.

An alternative approach avoiding the pipeline error is needed. Manpower supply must be 
properly accounted for in allocating Defense resources. In the proposed model, decision making 
in peacetime is the assumed scenario, though extensions to wartime are straightforward. The 
basic question will be what mix of weapon systems to accumulate over a long term planning horizon. 
The present value of the selected weapon configuration serves as the objective. The model accounts 
for all personnel types, including those such as recruits, trainees, and retirees, who are not specifi- 
cally productive. It also can account for all budget allocations within the annual budget. These 
allocations are called subbudgets, and parallel actual Defense expenditures. 

The model reflects the following assumptions: 

1) Defense budgets are obligated annually. They are based on national macro-economic 
policy (which, in turn, reflects national security policy), and are functions principally of GNP 
and the unemployment rate. Since they are determined by internal factors, at least in peacetime, 
defense budgets are reasonably predictable. 

2) Defense budgets are fixed for the year. Defense cannot borrow or lend funds to increase or 
diminish the annual budget. 

3) While the social cost of capital is basic to the decision society makes when it decides how 
much to allocate to Defense, it is not equivalent to the cost of capital which Defense should base its 
decisions on once it has received the funds. That defense investment and the social cost of capital 
are not closely related has been upheld empirically. During periods of high world tension, defense 
expenditures have reflected imputed discount rates in excess of 20% while the social cost of capital
(as measured by the treasury bond rate) remained at 3% during the same period.†

4) The military labor market is very imperfect. Capital is not a surrogate for manpower in 
the short run. Dollars cannot be converted into manpower for two reasons. First, military personnel 
are hired only as recruits, with needed skills then developed. Second, all equal grades are paid 
essentially the same wage regardless of relative worth. (The existence of specialty pay counters 
but does not neutralize this point.) 

5) Due to the unforeseeable tumultuous effects of sudden changes in Defense posture, changes
in Defense policy must occur gradually to allow the economy to adjust. Also, risk considerations,
such as maintaining some minimum defense posture, must be accommodated, and initial conditions
of the analysis must match current defense configurations.






*Department of Defense Instruction 7043.3 of 18 October 1972, "Economic Analysis and Program Evaluation
for Resource Management" specifies a 10% discount rate for all defense investment decisions. The proper rate for
discounting social investments is discussed from a conceptual standpoint in Arrow and Kurz [1]. But as will be
seen from the model below, an even more serious error than an improper rate is to disregard future pension costs
altogether. This practice results from using limited planning horizons, and since pension costs are among the last to
occur, causes manpower costs to be biased downward.

†For a discussion of discount rates in excess of 20% see Hitch and McKean [7, p. 211].




Rationale for the Manpower Planning/Capital Budgeting (MAPCAB) Model 

Model development is based on the above assumptions. Collectively these assumptions make 
the Defense situation unique. However, some capital budgeting techniques borrowed from business 
decision theory, combined with familiar optimization methodology, can clarify the Defense re- 
source allocation problem. MAPCAB combines some standard manpower planning techniques with 
capital budgeting criteria in a mathematical program to yield solutions. An iterative methodology 
used in the model makes the underlying interrelations between manpower and capital allocation 
visible to the decision maker. This facilitates coordination between the policy maker and the 
analyst when the model is parametrically exercised. 

The model has the following features: 

1) Superficially, the proposed technique for discounting is radically different from the current 
cost of capital orientation. The justification for the discount rate concept is, as usual, the relative 
advantage of current consumption over future consumption. But two questions arise. First, if 
Defense is to discount, what is the equivalent of "consumable" for defense? Second, what is the 
value of current consumables relative to future ones? Defense exists to deter, defend, or attack 
enemies. It can do that only with employable weapon units like naval task groups, ballistic missile 
submarines, bomber wings, missile defense units, etc. Resources can either be expended to have 
such weapon units now, or they can be directed toward building shipyards, conducting research 
and development, and building training facilities so that more weapon units will be available later. 
Viewed in this light, the consumables are employable weapon units. The worth of current relative 
to future weapons is obviously dependent on the tension level now versus the future. Consequently 
the objective function consists of the cumulative present value of deployable weapon systems,
discounted as a function of the tension level. The tension level is measured as the probability of 
engagement in hostile activity during each year of the planning horizon. (The probabilities could be 
obtained from a panel of intelligence experts through Delphi techniques.*) The objective function 
variables can be adjusted for tactical and/or physical depreciation, reliability, and relative or 
absolute payoff. 

2) The objective is maximized subject to budget and manpower constraints. Budgets and 
manpower limitations are estimated for a multiperiod planning horizon. Since future manpower 
supply is partially controlled by certain budget allocations (e.g. recruiting and training budgets), 
alterations to such budgets are made and new manpower estimates obtained. Changing manpower 
levels in turn affects budget allocations. In essence an entire new set of constraints is obtained. The 
new optimum objective is compared with the original solution. Through a process of such iterations, 
manpower supply affects weapon selection and weapon selection affects manpower supply, until a 
solution reasonably near an optimum obtains. 

3) Large shifts in defense posture could cause economic tidal waves in the private sector. Such
effects are prevented in the model by constraints which restrict year-to-year changes in weapon
levels and/or budget allocations to a predetermined percentage of the earlier year's level.
Additionally, risk is included as a constraint which places lower bounds on selected variables.

4) Finally, the model assumes divisibility in the variables. While one could argue for integral
weapon units, most weapons are either sufficiently numerous to negate the value of an integer
solution (e.g. FBM Submarines), or, in fact, are divisible (task groups).



*For a description of Delphi see Quade, "When Quantitative Models are Inadequate" [9, pp 333-343]. 




The MAPCAB model as presented in the next section incorporates the above factors in a
linear program. It uses opportunity costs derived from the optimization process to make the budget
allocation decisions which alter future manpower supply. The procedure compares normalized 
opportunity costs of the various scarce factors, and then heuristically alters manpower related 
budgets. Perceived shortages in manpower relative to capital are reduced by deducting funds from 
non-personnel related budgets and applying them to recruiting more people, or raising wages to 
retain people. Perceived shortages in skilled relative to unskilled manpower are reduced by shifting 
recruiting funds to training budgets. Budget reallocation continues so long as improvements in the 
objective function result. Eventually the opportunity costs will yield no further improvement within 
the budget flexibility allowed. 

The model is intended to assist in long term policy decisions. Directions rather than exact 
magnitudes of movements are perhaps the most relevant output of a policy model. The type of 
policy questions which the model can answer include the following: Should manpower levels be 
increased or decreased? Should a particular system be expanded or phased out? What skills will be 
in the shortest supply if a certain mix is accumulated? 

This initial model is entirely deterministic. Stochastic extensions accounting for distributions of 
annual budgets and manpower transitions are proper for follow on studies. 

II. MODEL DETAILS 

A. Notation 

A            Vector A.

A            Matrix A.

a_kjt        Amount of budget k needed per unit weapon j year t.

B            Average value of budget dual variables for planning horizon T. (See equation (D-3).)

B_kt         Budget type k available in year t.

B_rt         Recruiting budget for year t.

B_st         Training budget for year t.

B*_kt        Budget B_kt adjusted to reflect relative worth of money to manpower duals. (See
             equations (D-4), (D-6) and (D-7).)

B**_kt       Budget B_kt adjusted to reflect relative worths of money to manpower duals and
             relative worths of specialized to unspecialized skills. (See equations (D-12) and
             (D-13).)

B_t          Total budget allocated to Defense in year t.

C_rgct       Accounting cost of skill code rgc in year t. Includes all personnel related costs such
             as wages, incentives, housing, medical, and administrative costs.

C_rgu,rgs    The accounting cost of converting a person of skill code rgu into skill code rgs.

D_i          The discount rate used in discounting the objective for year i. This rate is based on
             the world tension level for year i rather than on the cost of capital.

d_rgct       Dual variable associated with the constraint indexed by rgct.

d'_rgct      The dual variable d adjusted to dollar units. (See equation (D-14).)

d''_rgct     The adjusted dual d' adjusted a second time for the training cost of developing skill
             code rgc. (See equation (D-15).)




The average value of all duals associated with manpower constraints averaged over 
the entire planning horizon T. (Equation (D-2).) 

The number of men of skill code rgu_j which can be converted to code rgs_j if the
recommended budget amount is added to the current budget for such conversions.
(See equation (D-20).)

The number of recruits obtained in year t. 

Number of retired persons living in year t. 

Number of persons deceasing in year t. 

The vector of manpower levels existing at time 0. 

The number of men of skill code rgc required to man a unit of weapon type j in year t. 

The supply of men of type rgc available in year t. 

The vector of all manpower skill codes available in year t. 

Total number of dual variables associated with constraints indexed by manpower
type rgi for the entire planning horizon T. If there were n different rgi type
skills in each year t, then N_i = nT.

The relative payoff or utility of weapon type j in year t. P reflects reliability, depreci- 
ation, obsolescence, and mission weighting, and is discussed in formulas (B-1) 
and (B-4). 

The index indicating manpower type rate/grade/code rgc. The underlining of an 
index does not imply vector notation ; rather it indicates a single index. The triple 
designation is used to facilitate implementation. 

A multiplier obtained through the heuristic defined in formula (D-5) , which denotes 
that portion of the subbudget B*rt to deduct from operating budget B^t when 
adjusting budgets to provide more manpower. Equations (D-4) and (D-6) 
pertain. 

Analogous to i?*,- but designating the amount of increase required in budget B^t 
to reduce manpower supply. Equations (D-7) and (D-8) pertain. 

The average value of all duals associated with specialized manpower constraints 
averaged over the entire planning horizon T. (See equation (D-9).) 

The average of all duals associated with specialized manpower constraints averaged 
over year t. Equation (D-11) pertains. 

The fraction of the reallocated recruiting budget which should be applied to
developing specialty rgs_j in year t. (See formulas (D-16), (D-17), and (D-20).)

The planning horizon, i.e., t = 0, 1, 2, . . ., T.

The transition matrix for year t characterized in C-1. 

The readjusted transition rate reflecting the change in T^ necessitated by the budget 
reallocations which effect manpower supply. (See equations (D-21) and D-22).) 

The heuristic denoting the fraction of the budget reallocation to develop new special- 
ized skills rgS j which should be used to retrain specifically skill rguj. (See equations 
(D-19) and (D-20).) 




$U$   The average value of all duals associated with unspecialized manpower constraints
      averaged over the entire planning horizon $T$. (See equation (D-10).)

$U_t$   The average of all duals associated with unspecialized manpower constraints averaged
      over year $t$. Equation (D-11) pertains.

$x_{jt}$   The amount of type $j$ weapon units to be deployable in year $t$.

$Y_{rt}$   The amount by which the recruiting budget $B^*_{rt}$ should be shifted to or from
      training budget $B_{st}$ in year $t$. (See equations (D-11), (D-12) and (D-13).)

B. Optimization Phase 

Objective Function 

The objective function is assumed linear. Extensions to include diminishing marginal returns
are appropriate but not considered here. Let

($x_{jt}$: $j = 1, 2, \ldots, J$; $t = 1, 2, \ldots, T$)

represent the set of deployable weapon systems selected for the planning horizon $T$, where
$x_{jt}$ represents the number of units of weapon type $j$ available in year $t$. Assume there
are known factors $r_{jt}$, $q_{jt}$, and $s_{jt}$ representing reliability, physical
depreciation, and tactical depreciation, respectively. These three factors are included in this
discussion because each is a parameter characterizing a weapon unit. A brief digression to
discuss how values for each can be obtained is therefore appropriate. Reliability $r_{jt}$
reflects, essentially, the "up time" of a weapon unit. If (at time $t$) it requires four
aircraft in the inventory to keep one operational, then $r_{jt}$ would be .25. Such data are
available for most systems.

Physical depreciation $q_{jt}$ represents system decay in the form of parts support. As systems
age, repair and support costs tend to increase; thus $q_{jt}$ would normally be decreasing in
time. Again, data for such depreciation should be available from logistic records. A
depreciation factor similar to accounting depreciation could be used to obtain values of
$q_{jt}$ for some systems if data are lacking.

Tactical depreciation is a function of the potential adversary's systems. Values for $s_{jt}$
would have to be based on historical estimates of system tactical life. An estimate of tactical
decay for model purposes might, if data are lacking, arbitrarily be based on exponential decay,
with the decay rate being a function of the relative lifetime of different systems, e.g.,
aircraft being useful for 10 years, ships for 30, radars for 6, etc.

Assume now that there exist known measures of relative utility (in the foreseen scenarios)
for the different weapon types. Denote these $p_{jt}$. These are of more theoretical impact to
the model than the above three factors. The $p_{jt}$ are the required output measures of the
different weapon units during the different time frames; $p_{jt}x_{jt}$ would represent the raw
output of the $j$th system in period $t$. How does one obtain the $p_{jt}$? Ideally, through a
Delphi technique or other opinion sampling process, one obtains them from military experts. As
an initial estimate, however, one could let $p_{jt}$ equal the unit cost of a system divided by
the system's expected lifetime. This method implies obvious assumptions, one being that the cost
of a system is a measure of its relative worth. This may not always be true, although it should
be valid when a system is first produced. Nonetheless, this is an adequate starting point, one
that lends itself to discussion by military planners, and therefore to subsequent adjustments
to more accurate values of $p_{jt}$.



MANPOWER PLANNING/CAPITAL BUDGETING



169 



Of course, the aim is to obtain objective function coefficients reflecting the relative
utilities of different systems at different times, adjusted for reliability, and for physical
and tactical depreciation. Let these coefficients be denoted $P_{jt}$. Then

$P_{jt} = f(p_{jt}, r_{jt}, q_{jt}, s_{jt})$.

We will assume $f$ takes the following specific form:

(B-1)   $P_{jt} = p_{jt}\, r_{jt}\, q_{jt}\, s_{jt}$.

The unadjusted effectiveness of all the weapon units available in year $t$, that is, total
defense system effectiveness in year $t$, then becomes

$\sum_j P_{jt} x_{jt}$.



We would like to convert each year's defense capability to a "present value." If there exists
some time preference for defense capabilities, then this can be done by an adjustment similar to
time discounting. In fact, if $D_t$ represents the discount rate in year $t$, then the present
value of year $t$'s output becomes

(B-2)   $\dfrac{1}{\prod_{i=1}^{t} (1+D_i)} \sum_j P_{jt} x_{jt}$.



The $D_t$ should reflect the relative needs for defense systems at different times. This
translates into having high values for $D_t$ during high world tension levels, and low values
during low tension levels. State Department and Defense Department planners might be able to
predict relative tension levels; certainly in the past the buildups to wars have been quite
obvious, and estimates on the return to peacetime have been made. The $D_t$'s ideally would
reflect these estimates. If agreement is not possible, then a constant rate for $D_t$, such as
5%, could be used. This is similar to procedures in financial discounting, where some constant
rate (e.g., 10%) is used when agreement on the time structure of interest rates is not possible.

The overall present value of the objective function then is the present value of the set of
selected systems $x_{jt}$. This is written:

(B-3)   $\text{Present Value} = \sum_{t=1}^{T} \dfrac{\sum_j P_{jt} x_{jt}}{\prod_{i=1}^{t} (1+D_i)}$



If mission versatility of the various systems is a factor, let $S_{mj}$ be a measure of the
effectiveness of weapon type $j$ in mission $m$ and let $w_m$ be the weight specifying the
relative importance of mission $m$ in the overall defense scenario. The objective function then
becomes

(B-4)   $\sum_{t=1}^{T} \dfrac{\sum_m \sum_j w_m S_{mj} P_{jt} x_{jt}}{\prod_{i=1}^{t} (1+D_i)}$

Hereafter expression (B-3) will be used to refer to the objective; maximization is understood.






Constraints 

1) Budget constraints: Let $a_{kjt}$ be the amount of budget type $k$ needed to support one
unit of weapon type $j$ in year $t$, and let $B_{kt}$ be the allocation of funds to subbudget
$k$ in year $t$. The $B_{kt}$, $k = 1, 2, \ldots, K$, include only those subbudgets not directly
dependent on manpower levels. (Examples of included budgets are weapons procurement, support
equipment, non-housing construction, and central supply. Excluded budgets are wages and
incentives, personnel administrative support, medical, and housing.) The budget constraints are
the following for the non-personnel related budgets:

Budget Constraints for Year t

Budget 1:   $a_{11t}x_{1t} + \cdots + a_{1jt}x_{jt} + \cdots + a_{1Jt}x_{Jt} \le B_{1t}$

(B-5)   Budget k:   $a_{k1t}x_{1t} + \cdots + a_{kjt}x_{jt} + \cdots + a_{kJt}x_{Jt} \le B_{kt}$

Budget K:   $a_{K1t}x_{1t} + \cdots + a_{Kjt}x_{jt} + \cdots + a_{KJt}x_{Jt} \le B_{Kt}$

There will be one such set of constraints for each year $t$. The personnel related budgets
excluded from (B-5) enter the analysis through considerations of manpower supply below. Note
that the sum of the $B_{kt}$ for each year $t$ must be less than or equal to the annual budget
$B_t$ less the sum of all personnel budgets.

2) Manpower constraints: If $m_{rgc,jt}$ is the number of men of skill code $rgc$ (representing
a rate/grade/specialty-code combination) required to man and support one unit of weapon type $j$
in year $t$, and $M_{rgc,t}$ is the estimated manpower of type $rgc$ supplied in year $t$, then
for year $t$ the manpower constraints will be the following:

Manpower Constraints for Year t

$m_{111,1t}x_{1t} + \cdots + m_{111,jt}x_{jt} + \cdots + m_{111,Jt}x_{Jt} \le M_{111,t}$

(B-6)   $m_{rgc,1t}x_{1t} + \cdots + m_{rgc,jt}x_{jt} + \cdots + m_{rgc,Jt}x_{Jt} \le M_{rgc,t}$

$m_{RGC,1t}x_{1t} + \cdots + m_{RGC,jt}x_{jt} + \cdots + m_{RGC,Jt}x_{Jt} \le M_{RGC,t}$

Again, one such set of constraints applies for each year $t$.

3) Risk Constraints: For reasons of intuition, politics, or stability, constraints placing
either lower limits on the weapon units for any year or upper and lower bounds on certain ratios
of the variables may be included. Several types of such constraints are:

a) Lower Bound Constraints. These reflect the concern of military planners not to allow dips
in effectiveness below some minimum readiness level. The requirement that total effectiveness
in year $t+1$ be at least $R$ times that in year $t$, where $R$ is a constant near 1.0, would be
met by the following constraint:

(B-7)   $\sum_j P_{j,t+1}\, x_{j,t+1} \ge R \sum_j P_{jt}\, x_{jt}$




If $R'$ is a positive constant less than 1.0, the following constraint ensures that system $j$
not be phased out too suddenly:

(B-8)   $P_{j,t+1}\, x_{j,t+1} \ge R' P_{jt}\, x_{jt}$.

b) Upper and Lower Bounded Ratios. Military planners may feel that the relative ratios of
different systems should be kept within certain limits. This form of risk constraint can be met
by the following condition:

(B-9)   $L \le \dfrac{x_{jt}}{x_{it}} \le U$;   $L$, $U$ constants.

4) Technological Constraints: It may be that factors besides budget limitations prevent the
build up of certain systems. The following constraints are alternative ways of restricting the
accumulation rate:

(B-10)   $\dfrac{x_{j,t+1}}{x_{jt}} \le R$;   $R > 1$,

(B-11)   $x_{j,t+1} \le x_{jt} + j^*$;   $j^* > 0$.

5) Initial Conditions: The starting point of the analysis requires that planners begin from the
current weapon configuration $x_{10}, \ldots, x_{j0}, \ldots, x_{J0}$, the weapon mix effective
at $t = 0$. This can be accomplished either by specifying the first period variables or allowing
some flexibility in their values by constraints of the type

(B-12)   $a\,x_{j0} \le x_{j1} \le b\,x_{j0}$;   $0 < a < 1 < b$.

Such requirements have certain real-world value. First, they are realistic, as resources are seldom 
flexible enough to achieve large short term changes. Second, the starting point of the analysis 
coincides with the situation faced by decision makers, who must decide on future courses of action 
given the current status. Analytic recommendations and decision alternatives are more likely to be 
compatible, and the gap between decision maker and analyst is narrowed. Finally, by starting at 
the current mix and restricting movements to relatively small but realistic magnitudes, the assump- 
tion of linearity in the model is less restrictive. 

C. Manpower Supply Phase 
Manpower Transition Matrix 

1) Let $M_{rgct}$ be the vector of all manpower skill codes $rgc$ for all years $t$. Then
$M_{rgct}$ represents the right hand side of the manpower constraints (B-6) above, augmented
with those manpower codes which are not used in any of the weapon systems in the weapon mix
selected, such as recruits, trainees in training, and retired personnel. The manpower supply
phase of MAPCAB shows how $M_{rgct}$ is obtained and how it is modified through changes in
recruiting budgets, training budgets, etc., to provide the manpower supply most advantageous to
planners within the flexibility allowed. The normal approach to estimating $M$ is through a
process involving a Markov transition matrix.*



*Such an approach is described by Forbes [5, pp. 93-113].






Since in MAPCAB the transition rates are altered through budget manipulation and since the
initial "state vector" will be non-stochastic, the matrix will be referred to simply as the
(deterministic) transition rate matrix $T_t$. To isolate the skill codes $rgc$ which are
actually used to man systems from those manpower codes which are necessary but not used, the
following additional notation is introduced. Let

$M_{00t}$ = Number of recruits obtained in year $t$
$M_{p0t}$ = Number of retired people in system in year $t$
$M_{q0t}$ = Number leaving (quitting or deceasing) in year $t$
$T_{rgc,rgc',t}$ = transition rate from $rgc$ to $rgc'$ in $t$.

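The transition matrix $T_t$ can then be represented:

(C-1)
$$
T_t \;=\;
\begin{array}{c|ccc}
 & M_{rgc',t} & M_{p0,t} & M_{q0,t} \\ \hline
M_{00,t-1} & T_{00,rgc'} \;\cdots & & \\
M_{rgc,t-1} & \cdots\; T_{rgc,rgc'} \;\cdots & T_{rgc,p0} & T_{rgc,q0} \\
M_{p0,t-1} & \cdots & T_{p0,p0} & T_{p0,q0}
\end{array}
$$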

There will be one such matrix for each year t, with different values for the transitions likely. 

Given the matrices $T_t$, the initial manpower supply vector $M_{rgc0}$, and the scalar
recruiting quantities $M_{00t}$ for $t = 0, 1, \ldots, T-1$, the manpower supply vector
$M_{rgct}$ for any year $t$ can be determined using the following formula, which is derived in
the appendix:

(C-2)   $M_{rgct} = \prod_{i=1}^{t} T_i\, M_{rgc0} + \sum_{j=1}^{t-1} \prod_{i=j}^{t-1} T_{i+1}\, M_{00j}$

Subsequently the transitions in Tt will be discussed in more detail to show how each transition 
is obtained. It will be found that some transitions are the result of personal preferences on the 
part of the personnel, while others are the direct result of funds made available for specific training. 
For the former, planners must rely on historical data and estimated relationships between civilian/ 
military advantages, while the latter are more directly and accurately obtained. 

It should be pointed out that the $T_i$, $i \ge 1$, are specified once given the initial
manpower vector $M_{rgc0}$, the vector of recruiting budgets over time, the vector of training
budgets over time, and the initial transition matrix $T_0$. For, given the preferences of the
individuals concerned and the known quantities of recruits, trainees, etc., each year, the fact
that the matrix rows sum to one determines, under certain realistic assumptions, the remaining
$T_{rgc,rgc'}$ (i.e., those besides the transitions to skills, assuming fixed proportions of
transitions to voluntary positions).




D. The Optimization/Manpower Supply Relationship 

The optimization process described above yields primal and dual solutions under proper 
conditions. The proposed improvement to the allocation process involves a heuristic based on the 
dual variables (opportunity costs) associated with the primal budget and manpower constraints. 
To describe the heuristic, further definitions and notation are needed. 

Let the subscripts u and s denote unspecialized and specialized skills. Specialized training
is acquired through formal training programs, such as schooling, at significant cost. A person
of rate/grade rg who has received formal training of type s would then be classified by skill
code rgs, while the same rate/grade without specialty training would be designated skill code
rgu. This dichotomization of the previous subscript rgc is necessary to allow taking the
training budgets properly into account. The transitions to unspecialized skills are caused by
such relatively uncontrollable factors as time in rate, personal desires of individuals, and
individual motivation. The transitions to specialty skills, however, are mainly a function of
training funds available, and it is here that the decision makers have some real flexibility in
manpower planning, for budget allocations can be altered to provide increased or decreased
transition rates to the various specialty skills.

The general approach of the heuristic in taking this flexibility into account is the following.
First, an overall measure of manpower value relative to capital value is obtained by an
aggregate comparison of the opportunity costs associated with manpower versus budget
constraints. A determination is then made either to increase funds allocated to recruiting or to
decrease them: the former if manpower duals exceed budget duals, indicating a greater need for
men than money. Then an aggregate comparison is made between the specialized and unspecialized
skill codes, again using the opportunity costs as the comparison factor, and funds are then
shifted from recruiting to training, or vice versa, depending on the relative values of the
skills. Finally a determination is made as to which specialty skills should be increased or
decreased, again basing the decision on the relative opportunity costs. These steps are now
described in more detail.

Details of the Heuristic 

1) Redistributing operating and recruiting budgets: All budgets included in the optimization
phase will be referred to as operating budgets for convenience. Let the accounting cost of skill
$rgu$ be $c_{rgu}$ and for $rgs$ be $c_{rgs}$. These annual costs include wages and incentive
pay, plus prorated medical, housing, and administrative costs relatable to personnel. Let
$d_{rgut}$ and $d_{rgst}$ be the dual variables, or opportunity costs, associated with skill
codes $rgu$ and $rgs$ in period $t$. Then

(D-1)   $d'_{rgit} = \dfrac{d_{rgit}}{c_{rgi}}$,   $i = u, s$,

is the opportunity cost per man converted to opportunity cost per dollar, and is then in the
same units as, and therefore directly comparable with, the opportunity costs associated with
budget constraints. Define the aggregate manpower related opportunity cost to be the average of
all such converted manpower duals for all periods $t$, i.e.,

(D-2)   $M = \dfrac{1}{N_u + N_s} \sum_t \left( \sum_{rgu} d'_{rgut} + \sum_{rgs} d'_{rgst} \right)$,

where the symbol $\sum_{rgi}$ means the summation is to be taken over all skill codes $rgi$ and
$N_i$ is the number of $rgi$ skill codes that exist during the time span $T$, i.e., the number
of $rgi$ type duals. Similarly, for all budgets $B_{kt}$, define

(D-3)   $B = \dfrac{1}{KT} \sum_t \sum_k d_{kt}$.

Intuitively, if M/B is greater than 1, then it would seem efficient to shift funds from operating 
budgets into manpower. Manpower can be increased by increasing recruiting rates or decreasing 
resignation rates. The former may be accomplished by increasing recruiting efforts and/or making 
service life more attractive. The latter may result from pay raises or other personnel incentives, 
such as increased medical and retirement benefits. For expository reasons, only the effects of 
alternate recruiting budgets are detailed here. Let 

$M/B = 1 + x$;   $x > -1$.

The quantity $x$ is a measure of the amount of redistribution proposed. The heuristic to be
used is to increase recruiting budgets $B_{rt}$ by an amount $(x/z)B_{rt}$, with $z$ an
appropriate number limiting the percentage change in the budget to some reasonable amount. In
order to keep the annual budget $B_t$ unchanged, some or all of the $B_{kt}$ must be decreased.
Specifically, for all $t$, increase $B_{rt}$ to $B^*_{rt}$, where

(D-4)   $B^*_{rt} = B_{rt} + (x/z)B_{rt}$.

To maintain the annual budget at the original level, those budgets currently least binding
should be decreased, "least binding" meaning having the lowest opportunity costs. To do this,
define

(D-5)   $R_{kt} = \dfrac{1/d_{kt}}{\sum_k (1/d_{kt})}$.

Note that

$\sum_k R_{kt} = 1$

and that for small $d_{kt}$, $R_{kt}$ is large. Therefore the desired redistribution conditions
are met by the heuristic

(D-6)   $B^*_{kt} = B_{kt} - R_{kt}\, B^{*\prime}_{rt}$,

where $B^{*\prime}_{rt} = (x/z)B_{rt}$ is the amount added to the recruiting budget in (D-4).

This last expression is exemplary of the heuristic used whenever a set of budgets is to be
collectively decreased by some known amount. Were it the case that a set of budgets was to be
increased, the expression would be

(D-7)   $B^*_{kt} = B_{kt} + R'_{kt}\, B^{*\prime}_{it}$,

where

(D-8)

and $B^{*\prime}_{it}$ is the amount by which the budgets are to be increased, the subscript
$i$ rather than $r$ indicating that it need not be the recruiting budget which is used as the
base.
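The redistribution in (D-4) through (D-6) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the weights are taken to be proportional to the reciprocals of the duals (so that they sum to one and a small dual gives a large weight, as the text notes), every dual is assumed strictly positive, and the function names and all numbers below are hypothetical rather than taken from the paper.

```python
def inverse_dual_weights(duals):
    """Weights in the spirit of (D-5): proportional to 1/d_k, summing to one.

    Assumes every dual is strictly positive; a zero dual (a slack budget)
    would need separate treatment.
    """
    inverses = [1.0 / d for d in duals]
    total = sum(inverses)
    return [v / total for v in inverses]


def shift_to_recruiting(operating, duals, recruiting, x, z):
    """Raise the recruiting budget by (x/z)*B_r and take the same total
    amount out of the operating budgets, least binding budgets first."""
    increase = (x / z) * recruiting                 # amount added in (D-4)
    weights = inverse_dual_weights(duals)
    new_operating = [b - w * increase for b, w in zip(operating, weights)]
    return new_operating, recruiting + increase


# Hypothetical one-year data (billions of dollars, arbitrary dual values).
operating = [2.0, 1.5, 1.0]
duals = [4.0, 2.0, 1.0]
new_operating, new_recruiting = shift_to_recruiting(
    operating, duals, recruiting=0.045, x=0.5, z=5.0)

print(new_operating, new_recruiting)
```

The budget with the smallest dual absorbs the largest share of the cut, and the total of all budgets is unchanged, which is the property the heuristic is meant to preserve.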

2) Redistributing recruiting and training budgets:† In this subsection the details of subbudget
allocation to balance specialized and unspecialized manpower opportunity costs are presented.
The case where specialized manpower is relatively more valuable is treated here. The opposite
case is straightforward. Let

(D-9)   $S = \dfrac{1}{N_s} \sum_t \sum_{rgs} d'_{rgst}$

and

(D-10)   $U = \dfrac{1}{N_u} \sum_t \sum_{rgu} d'_{rgut}$.

$S$ and $U$ are average converted (to dollar unit) opportunity costs for specialized and
unspecialized skill codes averaged over the entire planning horizon $T$. Assume
$S/U = 1 + y$; $y > -1$. If $y > 0$, then funds should be shifted from developing unspecialized
skills into specialized skills. This can be accomplished by shifting recruiting funds into
training funds, especially in those years when specialized skills are most needed. To do so let
$S_t$ and $U_t$ be the average of the adjusted duals for specialized and unspecialized skills
for just the year $t$. For each $t$, decrease $B^*_{rt}$ by $Y_{rt}$, where

(D-11)   $Y_{rt} = \dfrac{S_t}{U_t}\,(y B^*_{rt})$

and designate the new annual recruiting subbudgets by

(D-12)   $B^{**}_{rt} = B^*_{rt} - Y_{rt}$.

To absorb this decrease, increase each annual training budget, denoted $B_{st}$, by this same
amount, that is, let

(D-13)   $B^{**}_{st} = B_{st} + Y_{rt}$

where again $B^{**}$ is the new budget.

†The current model stresses selecting budgets to alter and how much to alter them. The question
of when to alter them is best handled through a systems dynamics approach. The authors are
currently testing a feedback model which investigates this phasing problem.

The phasing consideration mentioned in the previous footnote should not be overlooked. If
the delay time required to convert skill $rgu$ into $rgs$ is very short, the above is valid. But
if it takes, say, a year to turn a recruit into a useful labor unit, and a year to develop an
unspecialized skill into a specialized one, then the budget redistributions must be made one,
two, or even more years prior to the year used for computing $S_t$ and $U_t$. In the
illustration which follows later, a one year delay is assumed for converting recruits to
productive labor units, and zero delay is assumed for converting u skills to s skills.

3) Readjusting the manpower transition rates: This section describes how $Y_{rt}$ is allocated
toward developing the skills $rgs$. Two factors apply: the relative values of the specialties
and the training costs associated with developing each specialty. The relative values of the
specialties are reflected in the opportunity costs $d_{rgst}$. Let the training cost of
converting skill $rgu$ into specialized skill $rgs$ be denoted $C_{rgu,rgs}$. These costs are
assumed constant over time, although the extension to allow changes over time is
straightforward. They will also be assumed constant over u, since a skill $rgs$ is developed
through formal means such as schooling, and costs would be equal for all trainees.

Therefore, recalling the definition

(D-14)   $d'_{rgst} = \dfrac{d_{rgst}}{c_{rgs}}$,

now define

(D-15)   $d''_{rgst} = \dfrac{d'_{rgst}}{C_{rgu,rgs}}$.

Here $d''$ is a weighted measure of the various opportunity costs, the weighting now being
formed with respect to the relative accounting costs of the personnel and of the skill
transition costs. A three stage approach is followed. First it is determined how much of the
$Y_{rt}$ should go toward training each specialty $rgs$ in year $t$. Then it is determined how
that training should be distributed among the $rgu$ skills, that is, which $rgu$ codes can best
afford to be converted into $rgs$ codes. Finally the transition rates $T_{rgu,rgs,t}$ are
adjusted to maintain consistency in the transition matrix $T_t$. These three phases are now
discussed separately.

Phase I. Distributing the $Y_{rt}$ among specialties: Let $rgs_i$ be a specific skill code from
the set of codes $rgs$. Then let

(D-16)   $S_i = \dfrac{d''_{rgs_i t}}{\sum_i d''_{rgs_i t}}$.

Then $S_i$ represents the fraction of $Y_{rt}$ to allocate toward the development of specialty
$rgs_i$. (Note again that the $S_i$ sum to unity and a large $d''$ implies a large allocation to
the manpower type it is associated with.) The amount allocated to training skill codes $rgs_i$
in year $t$ is

(D-17)   $S_i Y_{rt}$.
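Phase I amounts to normalizing the doubly converted duals of (D-15). The sketch below assumes two hypothetical specialties; the dual values, costs, and the amount `Y_rt` are invented for illustration, and the function name is not from the paper.

```python
def specialty_fractions(duals, wage_costs, training_costs):
    """Fractions S_i as in (D-16), built from d'' = d / (c * C),
    i.e. the dual divided by accounting cost (D-14) and then by
    training cost (D-15)."""
    weighted = [d / (c * big_c)
                for d, c, big_c in zip(duals, wage_costs, training_costs)]
    total = sum(weighted)
    return [w / total for w in weighted]


# Hypothetical data for two specialties in one year.
duals = [30.0, 10.0]                 # opportunity costs d_rgs
wage_costs = [10000.0, 10000.0]      # accounting costs c_rgs, dollars
training_costs = [3000.0, 6000.0]    # training costs C_rgu,rgs, dollars
Y_rt = 0.010                         # funds shifted to training, billions

fractions = specialty_fractions(duals, wage_costs, training_costs)
allocations = [s * Y_rt for s in fractions]   # expression (D-17)
print(fractions, allocations)
```

With these numbers the first specialty, which has the higher dual and the lower training cost, receives six sevenths of the shifted funds.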

Phase II. Determining which unspecialized skill codes to convert into $rgs_i$ codes: The funds
represented by expression (D-17) are spent in converting some $rgu$ personnel into $rgs_i$
personnel. The specific codes to convert will be those in excess supply, or having the lowest
opportunity cost. Let $rgu_j$ be a specific skill code from the set $rgu$, where $rgu$ is now
limited to those unspecialized skill codes which can be converted to $rgs_i$. If, analogously to
(D-14),

(D-18)   $d'_{rgu_j t} = \dfrac{d_{rgu_j t}}{c_{rgu_j}}$

then a small $d'$ indicates a good prospect for conversion to $rgs_i$. Therefore let

(D-19)

and allocate $U_j$ times the total added expenditure on obtaining new $rgs_i$ skills to
converting $rgu_j$ personnel into $rgs_i$ personnel; that is, allocate $U_j S_i Y_{rt}$ toward
converting $rgu_j$ into $rgs_i$ (in addition to what was previously spent on the same
conversion).

Phase III. Adjusting the transition rates: Each allocation $U_j S_i Y_{rt}$ can convert

(D-20)   $\dfrac{U_j S_i Y_{rt}}{C_{rgu,rgs_i}} = m_{ji}$

men of skill code $rgu_j$ into $rgs_i$. Therefore the transition rate $T_{rgu_j rgs_i}$ for year
$t$ must be increased to reflect this additional conversion, and the remaining transitions from
$rgu_j$ into other skill codes (specialized or unspecialized) must be adjusted to ensure the sum
of the rows of $T_t$ equals unity. Since it can be assumed that the $rgu_j$ personnel being
converted would otherwise have transitioned to other unspecialized skills (in fact, most would
have stayed $rgu_j$ or advanced a grade to $r, g+1, u_j$), the entire adjustment would be
absorbed if the following facts are utilized. First, the new transition rate
$T^*_{rgu_j rgs_i}$ must be the ratio of the total number of $rgu_j$ to $rgs_i$ conversions
performed in year $t$ divided by the total manpower of type $rgu_j$ available at the end of year
$t-1$. Therefore the new transition rate is

(D-21)   $T^*_{rgu_j rgs_i t} = T_{rgu_j rgs_i t} + \dfrac{m_{ji}}{M_{rgu_j, t-1}}$.

The new transition rate from $rgu_j$ to $rgu_k$, where $k$ indexes the unspecialized skills
which can be converted into $rgs_i$, is not usually controllable (since persons choose to become
specialized, rather than are forced to do so), and the most reasonable assumption is that the
conversions come at the cost of $rgu_k$ skills in the same ratios as the previous conversions to
such skills. More specifically,

(D-22)   $T^*_{rgu_j rgu_k t} = T_{jk} - \dfrac{T_{jk}}{\sum_k T_{jk}} \cdot \dfrac{m_{ji}}{M_{rgu_j, t-1}}$,

where, for convenience,

$T_{jk} = T_{rgu_j rgu_k t}$.

E. The Iterative Process



The heuristics developed in the last section allow obtaining new transition matrices $T^*_t$
which, using the time zero manpower supply vector $M_{rgc0}$ and the new estimated numbers of
recruits for each year $t$, yield new right hand sides for the optimization phase.† These new
right hand sides should yield an improved solution without violating the annual budget
limitations. If, on rerunning the optimization phase with these new constraints, the solution
becomes infeasible, the iteration should be repeated with each reallocation indicated by the
heuristics decreased by a fraction, such as halved. If the solution results in a decrease in the
objective, which may happen if the reallocations are so large that the optimum is passed, the
reallocations should also be halved. If the steps are small enough in each iteration, the
objective should show small improvements each iteration, until no further significant
improvement is obtainable. This point will naturally be reached if the dual variables are
approximately equal, or if the limits of the problem's flexibility in terms of budget changes or
manpower development are reached. Undesired occurrences, such as cycling or asymptotic behavior
without significant improvement, should be easily analyzed due to the sequential nature of
operations. This would not be the case if the problem were converted to a one phase
optimization program.

†Equation (C-2) yields the new manpower levels.
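The control flow of this iteration (accept a reallocation if it stays feasible and does not lower the objective, otherwise halve it and retry) can be sketched independently of any particular solver. In the sketch below `solve` and `propose` are hypothetical callables standing in for the optimization phase and the heuristic reallocation; the toy problem at the end is invented purely to exercise the loop and has nothing to do with the paper's data.

```python
def improve(state, solve, propose, max_iter=50, tol=1e-6):
    """Step-halving loop.

    solve(state)         -> objective value, or None if infeasible
    propose(state, step) -> new state with the reallocation scaled by step
    """
    best = solve(state)
    if best is None:
        raise ValueError("the starting allocation must be feasible")
    step = 1.0
    for _ in range(max_iter):
        candidate = propose(state, step)
        value = solve(candidate)
        if value is None or value < best:
            step /= 2.0            # infeasible, or the optimum was passed
            if step < tol:
                break
            continue
        gain = value - best
        state, best = candidate, value
        if gain < tol:             # no further significant improvement
            break
    return state, best


# Toy stand-ins: one budget share, feasible only up to 0.5, best at 0.3.
def toy_solve(share):
    if share > 0.5:
        return None
    return -(share - 0.3) ** 2


def toy_propose(share, step):
    # Deliberately overshoots at full step so that halving is exercised.
    return share + step * 3.0 * (0.3 - share)


final_share, final_value = improve(0.0, toy_solve, toy_propose)
print(final_share, final_value)
```

Keeping the optimization and the reallocation as separate calls mirrors the sequential structure the text recommends, which makes cycling or stalling easy to observe from the sequence of accepted steps.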

The iterative scheme used here is a modification of the decomposition technique formulated
by Kornai and Liptak in [8]. Computational forms of large scale applications of the above
manpower planning model can benefit from the algorithm they originated.

F. An Illustration 

Assume a hypothetical defense structure having three types of weapon units denoted A, B,
and C. Also assume a four year planning horizon and a simple manpower structure consisting only
of recruits, unspecialized skills u, and specialized skills s. The annual defense budget $B_t$
is constant over time. The annual discount rates to be used are obtained from intelligence
experts (if war is imminent, $D_t$ is high; for distant years it could stabilize at, say,
5 percent). Also, the relative worth of the different weapon types is estimated by experienced
military planners, as discussed.

Known Values:

• Annual budget = 4.851 billion dollars.

• Relative net payoffs ($P_{jt}$): A:B:C = 100:10:25

• Discount rate: $D_t$ = (0.10, 0.08, 0.05, 0.05). (Reflecting decreasing tension levels.)

• System requirements per unit: 



                                     A         B         C
Operating budget (billions)        0.16      0.0183    0.055
Manpower type u (thousands)        4.0       0.3       0.6
Manpower type s (thousands)        1.9       0.16      0.3

Cost of training skill code s: $3030 per man.

Accounting cost per man:

                                   recruit   u         s
                                   $5000     $8000     $10000

• Planned budget allocations:

Year                               1         2         3         4
recruiting ($ billions)            0.045     0.045     0.045     0.045
training ($ billions)              0.045     0.045     0.045     0.045




• Recruiting cost: $1000 per recruit. 

• Trainees: From training budget and cost per trainee 14,850 s-type personnel are trained 

each year. Initially 4,500 of these come from recruit types and 10,350 come from u types. 

• Initial transition matrix $T_1$:

                 u code    s code    resigned
recruit          0.9       0.1       0
u code           0.35      0.15      0.5
s code           0         0.5       0.5


• Initial values:

recruits = 45,000
u code = 69,000
s code = 34,000
weapon units:

type A = 10

type B = 50

type C = 20

• Restrictions: No weapon system can be decreased by more than 10% in any year. 
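As a check on how the transition matrix acts on the initial values, the following sketch applies the matrix once to the starting manpower. It assumes that the blank entries of the matrix are zero (each row then sums to one) and that the state vector is ordered recruit, u code, s code; the helper name `step` is hypothetical.

```python
# Rows: from-state (recruit, u, s); columns: to-state (u, s, resigned).
# Blank entries of the printed matrix are taken to be zero (an assumption).
T1 = [
    [0.90, 0.10, 0.00],
    [0.35, 0.15, 0.50],
    [0.00, 0.50, 0.50],
]
initial = [45.0, 69.0, 34.0]   # recruits, u code, s code, in thousands


def step(matrix, stock):
    """One year of transitions: row vector of stocks times the matrix."""
    columns = range(len(matrix[0]))
    return [sum(stock[i] * matrix[i][j] for i in range(len(stock)))
            for j in columns]


u1, s1, resigned1 = step(T1, initial)
trained_from_recruits = initial[0] * T1[0][1]   # recruits sent to s training
trained_from_u = initial[1] * T1[1][1]          # u personnel converted to s

print(round(u1, 2), round(s1, 2))               # 64.65 31.85
print(round(trained_from_recruits, 2), round(trained_from_u, 2))  # 4.5 10.35
```

The 4.5 and 10.35 thousand new s-type personnel add up to the 14,850 trainees quoted above, which supports reading the second column of the matrix as the training flow.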

Problems to be Considered : 

With the above facts two problems will be solved. In the first, manpower constraints will be
included in the operating budget rather than accounted for separately. This is the normal
approach, with each system being charged for the manpower required to man and support it. The
second problem uses the same data but explicitly accounts for manpower constraints; it yields a
totally different solution and a completely different allocation of resources. The second
problem is then revised through the heuristic rationale discussed, first to yield an improved
solution, and then to show what happens if the reallocation is carried too far. The data for
these problems are not precise, but are reasonable approximations to figures for actual weapons
in the U.S. defense arsenal, chosen to show the impact of using the wrong rationale.



Solutions: 

Using formula (B-3) the objective function to be maximized is

(G-1)   $90.9x_{A1} + 9.09x_{B1} + 22.7x_{C1} + 84x_{A2} + 8.4x_{B2} + 21.01x_{C2}$
        $+\, 80x_{A3} + 8x_{B3} + 20x_{C3} + 76.3x_{A4} + 7.63x_{B4} + 19.08x_{C4}$.
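The coefficients in (G-1) follow from dividing each payoff by the cumulative discount factor of (B-2). A short sketch, assuming the payoffs 100:10:25 are held constant over the four years (the function name is hypothetical):

```python
payoffs = {"A": 100.0, "B": 10.0, "C": 25.0}   # relative net payoffs
rates = [0.10, 0.08, 0.05, 0.05]               # discount rates D_1 .. D_4


def discounted_coefficients(payoffs, rates):
    """Return {(system, year): payoff / prod_{i<=year}(1 + D_i)}."""
    coefficients = {}
    factor = 1.0
    for year, rate in enumerate(rates, start=1):
        factor *= 1.0 + rate                   # cumulative product in (B-2)
        for name, payoff in payoffs.items():
            coefficients[(name, year)] = payoff / factor
    return coefficients


coefficients = discounted_coefficients(payoffs, rates)
for year in range(1, 5):
    print(year, [round(coefficients[(name, year)], 2) for name in "ABC"])
```

The computed values approximately reproduce the printed coefficients; small differences in the last digit of a few terms are to be expected from rounding in the original.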

The manpower supply of codes u and s, using the initially planned constant recruiting and
training budgets of 45 million dollars per year, is (in thousands of men)

(G-2)
Year         1         2         3         4
u code       64.65     63.13     61.96     60.95
s code       31.85     30.12     29.66     29.86

PROBLEM 1 : Systems costed out with manpower implicit in operating costs. 

The total annual budget of 4.851 billion dollars must be adjusted by deducting the annual 
training budget, the annual recruiting budget, and the annual wages paid to recruits who do not 
man or support any system. (The same would be true of retirees, who do not exist in this problem.) 
The available budget then is 

4.851 - 0.090 - (45,000 × recruit wage) = 4.536 billion dollars.

Since manpower costs are to be included in the system operating costs, the budget factors for each 
weapon unit must be adjusted. For example, system A costs must be increased on a per unit basis 
by the cost of paying the skills used by the system, 4000 men times $8000 or .032 billion for the 
u coded personnel used, and 1900 men times $10,000 for the s coded personnel used. The per unit 
cost of system A is then the original .16 plus .032 plus .019 billion dollars. The new per unit costs are 

system A=0.211 

system B = 0.0223 

system C= 0.0628 
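The arithmetic behind the available budget and the loaded per unit costs can be reproduced as follows. The per unit requirements for system C are taken here as 0.055 (operating budget) and 0.6 (thousand u-type men), which are the values that reproduce the printed cost of 0.0628; the dictionary and function names are hypothetical.

```python
operating = {"A": 0.16, "B": 0.0183, "C": 0.055}   # billions per unit
men_u = {"A": 4.0, "B": 0.3, "C": 0.6}             # thousands per unit
men_s = {"A": 1.9, "B": 0.16, "C": 0.3}            # thousands per unit
wage_recruit, wage_u, wage_s = 5000.0, 8000.0, 10000.0   # dollars per man


def loaded_unit_cost(system):
    """Operating cost plus wages of the u and s personnel, in billions."""
    wages = (men_u[system] * 1e3 * wage_u
             + men_s[system] * 1e3 * wage_s) / 1e9
    return operating[system] + wages


# Annual budget less training, recruiting, and recruit wages (billions).
available = 4.851 - 0.045 - 0.045 - 45_000 * wage_recruit / 1e9

print(round(available, 3))
for system in "ABC":
    print(system, round(loaded_unit_cost(system), 4))
```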

The appropriate budget constraints for years 1 through 4 thus are

(G-3)   $0.211x_{At} + 0.0223x_{Bt} + 0.0628x_{Ct} \le 4.536$;   $t = 1, \ldots, 4$.

The restriction that no system be depleted by more than 10% in any year is represented by

$x_{jt} \ge 0.9x_{j,t-1}$

or

$0.9x_{j,t-1} - x_{jt} \le 0$;   $j = A, B, C$;   $t = 2, 3, 4$

and

$x_{A1} \ge 9$;   $x_{B1} \ge 45$;   $x_{C1} \ge 18$.

The optimum solution is presented in Table 3 for comparison with subsequent solutions. The 
matrix representation of the problem is presented in Table 1. 
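The constraints of Problem 1 are simple enough to check a candidate plan by hand or with a few lines of code. The sketch below is not a solver; it only reports which budget or depletion constraints a given four year plan violates. The two plans tested (holding the initial mix, and cutting system C too fast) are hypothetical, as are the names used.

```python
SYSTEMS = ("A", "B", "C")
UNIT_COST = {"A": 0.211, "B": 0.0223, "C": 0.0628}   # billions per unit
BUDGET = 4.536                                       # billions per year
INITIAL = {"A": 10.0, "B": 50.0, "C": 20.0}          # units at t = 0


def violations(plan):
    """plan[t-1][j] = units of system j in year t.
    Returns a list naming every violated constraint."""
    found = []
    for t, year in enumerate(plan, start=1):
        cost = sum(UNIT_COST[j] * year[j] for j in SYSTEMS)
        if cost > BUDGET + 1e-9:
            found.append(f"budget year {t}")
        previous = INITIAL if t == 1 else plan[t - 2]
        for j in SYSTEMS:
            # No system may fall below 90% of the previous year's level.
            if year[j] < 0.9 * previous[j] - 1e-9:
                found.append(f"depletion {j} year {t}")
    return found


steady = [dict(INITIAL) for _ in range(4)]                 # keep the mix
cut = [dict(INITIAL, C=15.0)] + [dict(INITIAL) for _ in range(3)]

print(violations(steady))
print(violations(cut))
```

Holding the initial mix costs 4.481 billion per year and so is feasible, while dropping system C from 20 to 15 units in year 1 breaks the 10% depletion limit.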



Table 1. Problem 1

                xA1    xB1    xC1    xA2    xB2    xC2    xA3    xB3    xC3    xA4    xB4    xC4
Maximize       90.9   9.09   22.7   84.0    8.4   21.0   80.0    8.0   20.0   76.3   7.63   19.1

Subject to:
Budget 1       0.21  0.023  0.068                                                                  ≤ 4.536
Budget 2                            0.21  0.023  0.068                                             ≤ 4.536
Budget 3                                                 0.21  0.023  0.068                        ≤ 4.536
Budget 4                                                                      0.21  0.023  0.068   ≤ 4.536
Deplete A2      0.9                 -1.0                                                           ≤ 0
Deplete B2             0.9                 -1.0                                                    ≤ 0
Deplete C2                    0.9                 -1.0                                             ≤ 0
Deplete A3                           0.9                 -1.0                                      ≤ 0
Deplete B3                                  0.9                 -1.0                               ≤ 0
Deplete C3                                         0.9                 -1.0                        ≤ 0
Deplete A4                                                0.9                 -1.0                 ≤ 0
Deplete B4                                                       0.9                 -1.0          ≤ 0
Deplete C4                                                              0.9                 -1.0   ≤ 0
Initial A       1.0                                                                                ≥ 9.0
Initial B              1.0                                                                         ≥ 45
Initial C                     1.0                                                                  ≥ 18



PROBLEM 2: Manpower supply explicitly accounted for.* 

In this case the annual budget constraints must be adjusted to reflect not only the recruiting,
training, and recruit wage costs, but also all other wages paid. Doing this yields budget constraints
of 3.7, 3.73, 3.744, and 3.75 for years 1 to 4. The explicit budget and manpower constraints
are shown in Table 2, and the solution is in Table 3. The important thing to notice is that the
Problem 1 solution calls for increasing system A over time and decreasing system C, while Problem 2
calls for the opposite. This disparity results because in Solution 1, since manpower and money
are treated interchangeably, the systems selected are those with a low per unit operating cost (per
unit of payoff). But while the solution is feasible in terms of money, note that it is infeasible in terms
of manpower: in year 1, for example, the number of s personnel needed to man the selected systems
is 11.38(1.9) + 45(0.16) + 18(0.3) = 34.222 thousand men. Yet the total available supply of such
men is only 31,850. If the solution is accepted, and the systems bought, they will be undermanned,
not an uncommon defense occurrence.
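The infeasibility noted above can be reproduced directly from the year 1 values. This is a minimal sketch, assuming the s-personnel factors of Table 2 (1.9, 0.16 and 0.30 thousand men per unit), the per unit costs of Problem 1, and the year 1 Problem 1 solution from Table 3; the dictionary and function names are illustrative only.

```python
# Thousands of s personnel needed per unit of each system (Table 2 coefficients).
S_MEN_PER_UNIT = {"A": 1.9, "B": 0.16, "C": 0.30}
# Per unit costs with manpower folded in (Problem 1), billions of dollars.
UNIT_COST = {"A": 0.211, "B": 0.0223, "C": 0.0628}


def weighted_total(weights, units):
    """Sum of weight * quantity over the three systems."""
    return sum(weights[k] * units[k] for k in units)


year1_units = {"A": 11.38, "B": 45.0, "C": 18.0}   # Problem 1 solution, year 1
s_required = weighted_total(S_MEN_PER_UNIT, year1_units)
budget_used = weighted_total(UNIT_COST, year1_units)
s_available = 31.85                                 # thousand men, year 1

print(f"budget used: {budget_used:.3f} of 4.536 billion")
print(f"s personnel: {s_required:.3f} needed, {s_available} available")
```

The plan stays within the money constraint while the s-personnel requirement exceeds the supply, which is the point the text makes.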



*The three phases of Problem 2 are represented by the constraints of Table 2.



Table 2. Problem 2

Variables:  X_A1   X_B1   X_C1   X_A2   X_B2   X_C2   X_A3   X_B3   X_C3   X_A4   X_B4   X_C4        RHS

Objective:
Maximize    90.9   9.09   22.7   84.    8.4    21.    80.    8.0    20.    76.3   7.63   19.1

Subject to:
Budget 1    0.16   0.018  0.055                                                                 <=  B_1
u men 1     4.     0.30   0.60                                                                  <=  M_u1
s men 1     1.9    0.16   0.30                                                                  <=  M_s1
Budget 2                         0.16   0.018  0.055                                            <=  B_2
u men 2                          4.     0.30   0.60                                             <=  M_u2
s men 2                          1.9    0.16   0.30                                             <=  M_s2
Budget 3                                              0.16   0.018  0.055                       <=  B_3
u men 3                                               4.     0.30   0.60                        <=  M_u3
s men 3                                               1.9    0.16   0.30                        <=  M_s3
Budget 4                                                                   0.16   0.018  0.055  <=  B_4
u men 4                                                                    4.     0.30   0.60   <=  M_u4
s men 4                                                                    1.9    0.16   0.30   <=  M_s4
A, t=2      0.9                  -1.                                                            <=  0
B, t=2             0.9                  -1.                                                     <=  0
C, t=2                    0.9                  -1.                                              <=  0
A, t=3                           0.9                  -1.                                       <=  0
B, t=3                                  0.9                  -1.                                <=  0
C, t=3                                         0.9                  -1.                         <=  0
A, t=4                                                0.9                  -1.                  <=  0
B, t=4                                                       0.9                  -1.           <=  0
C, t=4                                                              0.9                  -1.    <=  0
Min A       1.0                                                                                 >=  9.0
Min B              1.0                                                                          >=  45.0
Min C                     1.0                                                                   >=  18.0

Right Hand Sides (Constraints 1-12):

Designation   Prob 2    Iter 1    Iter 2
B_1           3.7       3.528     3.474
M_u1          64.65     68.14     71.6
M_s1          31.85     38.36     40.4
B_2           3.73      3.51      3.453
M_u2          63.13     67.87     69.66
M_s2          30.12     40.38     41.34
B_3           3.744     3.50      3.411
M_u3          61.96     67.72     73.9
M_s3          29.66     4. 14     42.1
B_4           3.75      3.496     3.415
M_u4          60.95     67.68     73.05
M_s4          29.86     41.89     42.45





PROBLEM 2, ITERATION 1 



If the dual variables of the solution to Problem 2 are investigated, it becomes apparent that
manpower is more valuable than money. Furthermore, s skills are more valuable than u skills. If,
therefore, the recruiting budget is first increased by 67% each year, and 67% of this increase is
directed toward converting more recruits to s skills, the resulting manpower levels and cost changes
result in the solution shown in Table 3. Notice that the objective value has increased, but no annual
budget was changed. Further refinements are possible, an obvious one being to shift some training
funds into recruiting, since of the manpower related duals only those for the u skills are now binding.



Table 3. Solutions

                    Problem 1         Problem 2    Iter 1     Iter 2

Objective           6800              6430         6536       6450

Primal Variables
  X_A1              11.38             9            10.71      10.38
  X_B1              45                45           45         45
  X_C1              18                25.17        18         18
  X_A2              12.4              8.1          11         11.3
  X_B2              40.5              40.5         46.5       40.5
  X_C2              16.2              27.5         16.2       16.2
  X_A3              13.3              7.29         10.7       12.14
  X_B3              36.4              36.45        53.8       36.45
  X_C3              14.6              33.25        14.6       14.58
  X_A4              14.1              6.56         10.4       13.08
  X_B4              32.8              44.29        60.7       32.8
  X_C4              13.1              34.36        13.12      13.12

Dual Variables
  Constraint 1      Not Applicable                 568        568
  Constraint 2
  Constraint 3                        75.7
  Constraint 4                                     333        525
  Constraint 5                                     7.67
  Constraint 6                        70
  Constraint 7                                     3.7        500
  Constraint 8                                     7.3
  Constraint 9                        66.7
  Constraint 10                       230.7        302.8      477
  Constraint 11                                    6.96
  Constraint 12                       21.29


PROBLEM 2, ITERATION 2

If the reallocations of budgets are carried too far, the increase in the objective may be less than 
for smaller readjustments. If the Problem 2 adjustments are 73% instead of the 67% made in 
Iteration 1, then the results are as shown in Table 3. The objective is less than for Iteration 1, and 
note that only the budget constraints are binding, as all other duals are zero. Manpower is now less 
valuable than money, and funds should be shifted into operating budgets and out of recruiting/ 
training. 

III. APPENDIX 

Derivation of Equation (C-2) : 

Were it not for the scalar $M_{00t}$ denoting the new recruits each year, the manpower vector
required for the right hand side of the manpower constraints, $M_{rgct}$, would be obtained as follows:

(III-1)   $M_{rgct} = T_t T_{t-1} \cdots T_1 M_{rgc0}$ †

But this expression must be modified since the recruit-to-recruit transition rate is zero and new
recruits arrive each year. That equation (C-2) holds will be shown in an inductive proof. First, for
year 1, by definition

$M_{rgc1} = T_1 M_{rgc0}$.

Since $T_{00,00t}$, the transition of recruits to recruits, is zero for each year t, $M_{rgc1}$ will have its first
element equal to zero. To obtain a valid manpower supply vector in year 1, the recruits arriving in
year 1 must be added to $M_{rgc1}$. Define $M_{00t}$ to be the column vector

$(M_{00t}, 0, 0, \ldots, 0)$,

and let

$M'_{rgct} = M_{rgct} + M_{00t}$ for all t.

For year 1,

$M'_{rgc1} = M_{rgc1} + M_{001}$.

And, therefore,

$M'_{rgc1} = T_1 M_{rgc0} + M_{001}$.

†Since $M_{rgc1} = T_1 M_{rgc0}$ and $M_{rgci} = T_i M_{rgc,i-1}$ for all i from 2 to T, (III-1) is true by induction. Also, T does
not include the final column of (C-1), nor does M include the number of resignees, since resignees are external to
the system.

The next year's manpower vector can now be obtained:

$M_{rgc2} = T_2 M'_{rgc1} = T_2 (T_1 M_{rgc0} + M_{001}) = T_2 T_1 M_{rgc0} + T_2 M_{001}$

and

$M'_{rgc2} = M_{rgc2} + M_{002}$.

Proceeding in similar fashion,

$M_{rgc3} = T_3 M'_{rgc2} = T_3 T_2 T_1 M_{rgc0} + T_3 T_2 M_{001} + T_3 M_{002}$

and

$M'_{rgc3} = M_{rgc3} + M_{003}$.

It is clear by continuing that the expression for $M_{rgct}$ given in (C-2) is valid for the general case.
The total manpower supply vector is then $M_{rgct}$ augmented by the number of recruits entering in
year t.
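The recursion used in this derivation (apply the year's transition matrix to last year's total supply, then add the new recruits to the first element) can be sketched as follows. The three-state layout (recruit, u skill, s skill), the transition rates, the starting inventory and the recruit intake are all made-up illustrative numbers, not data from the paper, and the function names are hypothetical.

```python
def mat_vec(matrix, vector):
    """Multiply a square matrix (list of rows) by a column vector."""
    return [sum(row[j] * vector[j] for j in range(len(vector))) for row in matrix]


def manpower_supply(initial, transitions, recruits):
    """Total supply vectors for years 1..T.

    initial: inventory at time 0.
    transitions: yearly transition matrices; the first row is zero because
                 the recruit-to-recruit rate is zero.
    recruits: number of new recruits arriving in each year.
    """
    supply = []
    current = list(initial)
    for t_matrix, intake in zip(transitions, recruits):
        carried = mat_vec(t_matrix, current)   # last year's supply moved forward
        carried[0] += intake                   # add this year's recruits
        supply.append(carried)
        current = carried
    return supply


# Illustrative data only: states are (recruit, u skill, s skill).
T = [[0.0, 0.0, 0.0],
     [0.6, 0.8, 0.0],
     [0.3, 0.1, 0.9]]
years = manpower_supply([10.0, 50.0, 30.0], [T, T], [45.0, 45.0])
for year, vector in enumerate(years, start=1):
    print(year, [round(x, 2) for x in vector])
```

Unrolling the loop for two years gives the same sum of products that the derivation writes out term by term.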

BIBLIOGRAPHY 

[1] Arrow, Kenneth J., and Mordecai Kurz, Public Investment, The Rate of Return, and Optimal
Fiscal Policy (Baltimore: The Johns Hopkins Press, 1970).

[2] Carleton, Willard T., "Linear Programming and Capital Budgeting Models: A New
Interpretation," The Journal of Finance, 825-833 (December, 1969).

[3] Charnes, A., and W. W. Cooper, Management Models and Industrial Applications of Linear
Programming (New York: John Wiley & Sons, 1961).

[4] Charnes, A., W. W. Cooper and R. J. Niehaus, Studies in Civilian Manpower Planning
(NAVSO P-3540) (Washington, D.C.: Navy Department, Office of Civilian Manpower
Planning, July 1972).

[5] Forbes, A. F., "Markov Chain Models for Manpower Systems," in Manpower and
Management Science, pp. 93-113. Edited by D. H. Bartholomew and A. R. Smith (Lexington,
Mass.: D. C. Heath & Co., 1971).

[6] Hirshleifer, J., "On the Theory of Optimal Investment Decisions," Journal of Political
Economy, 329-352 (August, 1958).

[7] Hitch, C. J., and R. N. McKean, The Economics of Defense in the Nuclear Age (Santa Monica,
Calif.: The RAND Corporation, 1960).

[8] Kornai, J. and T. Liptak, "Two Level Planning," Econometrica, 33, 141-169 (1965).

[9] Quade, E. S., and W. Boucher, eds., Systems Analysis and Policy Planning: Applications in
Defense (New York: Elsevier, 1968).

[10] Weingartner, H. Martin, Mathematical Programming and the Analysis of Capital Budgeting
Problems (Chicago: Markham Publishing Co., 1967).



AN F APPROXIMATION FOR TWO-PARAMETER WEIBULL AND LOG- 
NORMAL TOLERANCE BOUNDS BASED ON POSSIBLY CENSORED DATA* 



Nancy R. Mann 



Rockwell International 
Thousand Oaks, California 



ABSTRACT 



An approximation suggested in Mann, Schafer and Singpurwalla [18] for obtaining
small-sample tolerance bounds based on possibly censored two-parameter
Weibull and lognormal samples is investigated. The tolerance bounds obtained are
those that effectively make most efficient use of sample data. Values based on the 
approximation are compared with some available exact values and shown to be in 
surprisingly good agreement, even in certain cases in which sample sizes are very 
small or censoring is extensive. Ranges over which error in the approximation is 
less than about 1 or 2 percent are determined. The investigation of the precision of 
the approximation extends results of Lawless [8], who considered large-sample 
maximum-likelihood estimates of parameters as the basis for approximate 95 per- 
cent Weibull tolerance bounds obtained by the general approach described in [18]. 
For Weibull (or extreme-value) data the approximation is particularly useful when 
sample sizes are moderately large (more than 25), but not large enough (well over 
100 for severely censored data) for asymptotic normality of estimators to apply. 
For such cases simplified efficient linear estimates or maximum-likelihood estimates 
may be used to obtain the approximate tolerance bounds. For lognormal censored 
data, best linear unbiased estimates may be used, or any efficient unbiased estima- 
tors for which variances and covariances are known as functions of the square of 
the distribution variance. 



1. INTRODUCTION 

In the following we consider tolerance bounds for two-parameter Weibull and lognormal
distributions, or equivalently, extreme-value (also known as Gumbel) and normal distributions,
respectively. In the context of reliability, when the distributions are failure-time populations,
the tolerance bounds are confidence bounds on reliable life $t_p$, a population percentile corresponding
to a specified survival probability $R$, with $P = 1 - R$.

We assume that there may be type-II censoring of the data. In a life-testing context, this
means that the life test applied to a sample of size n is terminated at the time of the rth failure,
$r \le n$. We order the observable variates from smallest to largest and call their logarithms
$X_{1,n}, \ldots, X_{r,n}$. Thus if the data are Weibull (lognormal), the sample of unordered X's is a sample of variates
from the extreme-value (normal) distribution. In either case the distribution of the X's has a
location parameter $\mu$ and a scale parameter $\sigma$. If the X's are extreme-value variates, their density
function has the form

(1.1)   $f(x) = \frac{1}{\sigma} \exp\left[-\exp\left(\frac{x-\mu}{\sigma}\right)\right] \exp\left(\frac{x-\mu}{\sigma}\right)$.

If the X's are normal, their density function has the familiar form

$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right]$.

*The research documented herein was supported by the Air Force Office of Scientific Research, AFSC, USAF
under Contract No. F44620-71-C-0029.

For Weibull or extreme-value data, if censoring is not extensive, one can use Monte Carlo-generated
tables of Thoman, Bain and Antle [22] and Billman, Antle and Bain [2] to obtain tolerance
bounds. The tables are used in conjunction with iteratively obtained maximum-likelihood estimates
of distribution parameters. If sample size n is 25 or less, one can determine tolerance bounds from
tables of Mann and Fertig [14], Mann, Fertig and Scheuer [16], or Mann, Schafer and Singpurwalla
[18] for r=3(1)n. These tables have also been generated by Monte Carlo procedures. They require
the use of best linear invariant estimates

$\sum_{i=1}^{r} a_{i,r,n} x_{i,n}$ and $\sum_{i=1}^{r} c_{i,r,n} x_{i,n}$

of the parameters $\mu$ and $\sigma$, with values of $\{a_{i,r,n}\}$ and $\{c_{i,r,n}\}$ available in Mann [11, 12] and in [18].
Results of Engelhardt and Bain [4], Mann and Fertig [15], and Thoman, Bain and Antle [22]
indicate that maximum-likelihood estimators and best linear invariant estimators give very nearly
the same results for the extreme-value distribution, even for small sample sizes and rather extreme
censoring. Moreover, for extreme-value data, the distributions of these two types of parameter
estimators are very nearly the same.

For lognormal data that are not censored, tolerance bounds can be determined by simply
disregarding the order of the observations, $x_{1,n}, \ldots, x_{n,n}$, and calculating $\bar{x}$ and $s$ to be used with
tables of the noncentral t-distribution.

The approximation to be described in Section 2 is suggested in [18] as a method for obtaining
tolerance bounds under conditions of sample size or censoring for which the Monte Carlo generated
tables or the noncentral t-tables described above are not applicable. The approximation is based on
efficient linear estimators such as best linear invariant (B.L.I.) estimators or linear transformations
of these estimators which are best linear unbiased (B.L.U.). One can also use maximum-likelihood
(M.L.) estimators or efficient approximations to the optimum linear estimators, as long as information
concerning variances and covariances of unbiased versions of the estimators is available. All of
the B.L.I., B.L.U. and M.L. estimators contain essentially all the information in the sample that
can be used in making inferences about the distribution parameters (see, for example, Lawless [7]).
The approximate lower tolerance bounds described below are thus essentially the most accurate
available (have, in effect, the minimum probability of falling below any percentile less than the
true percentile of interest) among bounds independent of the parameters (see Lehman [14, p. 78]).

2. THE F APPROXIMATION 

The result that provides the basis for the F approximation is that of Pyke [20] applying to any
difference of adjacent ordered observable variates (called a "spacing" by Pyke) from a continuous
distribution. Pyke showed that as sample size increases, the distribution of each such difference
approaches that of a weighted exponential variate, or equivalently, that of a weighted chi-square
with 2 degrees of freedom. Recently van Montfort [23] observed that any spacing

(2.1)   $H_i = X_{i+1,n} - X_{i,n}, \quad i = 1, \ldots, n-1,$

from the extreme-value distribution, when divided by its expectation, has approximately an
exponential distribution with mean 1 and variance very near to 1 [thus $2H_i/E(H_i)$ is approximately a
chi-square with 2 degrees of freedom] even for sample size n as small as 3. Also, van Montfort
observed that for a size-n sample of extreme-value variates with n as small as 3, the covariance
between $H_i$ and $H_j$, $i \ne j$, is approximately zero. Pyke [20] showed that in general $H_i$ and $H_j$ are
asymptotically independent for $p_i = i/n$ and $p_j = j/n$ fixed with increasing n. Using tables, in Sarhan
and Greenberg [21], of expectations, variances and covariances of reduced order statistics from the
normal distribution, one can see that the properties observed by van Montfort for the
extreme-value distribution hold for Gaussian spacings for sample sizes about 6 or larger.

Mann and Fertig [15] combined these results with that of Box [3], demonstrating that linear
combinations of chi-squares are weighted chi-squares. They showed that, for sample size greater
than about 3, any efficient unbiased linear estimator $\sigma^*_{r,n}$ of the extreme-value scale parameter
with variance $C_{r,n}\sigma^2$ is such that $\sigma^*_{r,n}/\sigma$ is approximately a chi-squared variate over its degrees of
freedom with $2/C_{r,n}$ degrees of freedom. Here the authors used the two-moment fit of Patnaik [19]
of a weighted chi-square

$\sigma^*_{r,n} = \sum_i l_i (X_{i+1,n} - X_{i,n})$ with mean $m = \sigma$ and variance $v = C_{r,n}\sigma^2$

to a chi-square ($2m\sigma^*_{r,n}/v$ with $2m^2/v$ degrees of freedom). Results of Grubbs, Coon and Pearson
[6] and Fertig and Mann [5] indicate that under certain conditions this result is applicable also to
efficient unbiased linear or maximum-likelihood estimators of Gaussian scale parameters.

Now for either an extreme-value or normal distribution, consider an efficient unbiased linear
estimator $\mu^*_{r,n}$ of $\mu$ having variance $A_{r,n}\sigma^2$ and covariance $B_{r,n}\sigma^2$ with $\sigma^*_{r,n}$. The estimators
$\mu^*_{r,n}$ and $\sigma^*_{r,n}$ are best linear unbiased estimators or efficient simplified approximations thereof.
Form the statistic $X^*_{r,n} = \mu^*_{r,n} - (B_{r,n}/C_{r,n})\sigma^*_{r,n}$, which can easily be seen to have covariance
$[B_{r,n} - (B_{r,n}/C_{r,n})C_{r,n}]\sigma^2 = 0$ with $\sigma^*_{r,n}$. Let $x_p$ be the 100Pth percentile of the distribution of X, with $P = 1 - R$.
It is for $x_p$, or $\exp(x_p)$, that a lower confidence bound is desired. The parameter $x_p$ is of the form
$\mu + z_p\sigma$, where if X is a normal variate, then $z_p$ is the 100Pth percentile of a standard normal
distribution with mean zero and variance unity. If X is an extreme-value variate, then, from (1.1),
$z_p = \ln \ln[(1-P)^{-1}]$.

If one now forms $X^*_{r,n} - x_p$, as suggested in [18], it can be seen that the expectation $E_1$ of this
difference is given by $E_1 = [(-B_{r,n}/C_{r,n}) - z_p]\sigma$. It can be shown that if P is sufficiently small for a
specified r and n, $(X^*_{r,n} - x_p)/E_1$ is, with high probability, a positive random variate. Hence one
can combine results of Fertig and Mann [5] and Mann [12], applying to prediction intervals, with
previous empirical results of [8, 18] to infer that for appropriate combinations of r, n and P,

(2.2)   $F_1 = (X^*_{r,n} - x_p)/[\sigma^*_{r,n}(-B_{r,n}/C_{r,n} - z_p)]$

has an approximate F distribution. The precision of the approximation is investigated in Section 3.
The number $\nu_1$ of degrees of freedom for the numerator $X^*_{r,n} - x_p$ of $F_1$ can be obtained from
Patnaik's [19] two-moment fit: $(X^*_{r,n} - x_p)/m$, with $m = E_1$, is approximately a chi-squared variate
over its degrees of freedom, and its number of degrees of freedom is equal to $2m^2/v$, where v is

(2.3)   $\mathrm{Var}(X^*_{r,n}) = (A_{r,n} - B^2_{r,n}/C_{r,n})\sigma^2$.

Therefore, the numbers of degrees of freedom for the approximate F-variate $F_1$ are

(2.4)   $\nu_1 = 2(-B_{r,n}/C_{r,n} - z_p)^2/(A_{r,n} - B^2_{r,n}/C_{r,n})$

and

(2.5)   $\nu_2 = 2/C_{r,n}$.

We can now determine, from (2.2), an approximate $100\gamma$ percent lower confidence bound for
$t_p = \exp(x_p)$ as

(2.6)   $\exp\{\mu^*_{r,n} - [B_{r,n}/C_{r,n} - (B_{r,n}/C_{r,n} + z_p)F_\gamma(\nu_1, \nu_2)]\sigma^*_{r,n}\}$

where $F_\gamma(\nu_1, \nu_2)$ is the $100\gamma$th percentile of an F distribution with $\nu_1$ and $\nu_2$ degrees of freedom.

If $\nu_1$ and $\nu_2$ are not integers, which will generally be the case, then one can interpolate in tables
of percentiles of the F distribution or use an approximation given by Mann and Grubbs [17].
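The quantities entering the F approximation are simple functions of the variance and covariance factors. The following sketch assumes the extreme-value case and takes the F percentile as an input (looked up or interpolated from tables, as the text describes); the function names and the numerical inputs are hypothetical and chosen only to exercise the formulas.

```python
import math


def z_extreme_value(P):
    """z_p = ln ln[1/(1-P)] for the extreme-value distribution."""
    return math.log(-math.log(1.0 - P))


def degrees_of_freedom(A, B, C, z_p):
    """Numerator and denominator degrees of freedom of the approximate F variate."""
    nu1 = 2.0 * (-B / C - z_p) ** 2 / (A - B * B / C)
    nu2 = 2.0 / C
    return nu1, nu2


def lower_bound(mu_star, sigma_star, B, C, z_p, F_gamma):
    """Approximate lower confidence bound for t_p = exp(x_p).

    F_gamma is the percentile of the F distribution with (nu1, nu2) degrees
    of freedom, supplied by the caller from tables.
    """
    ratio = B / C
    return math.exp(mu_star - (ratio - (ratio + z_p) * F_gamma) * sigma_star)


# Made-up inputs for illustration (not values from the paper's tables).
A, B, C = 0.5, 0.1, 0.25
z_p = -2.0
nu1, nu2 = degrees_of_freedom(A, B, C, z_p)
print(nu1, nu2)
print(lower_bound(1.0, 0.5, B, C, z_p, F_gamma=1.0))
```

A convenient sanity check is that with the F percentile set to 1 the bound collapses to the plug-in estimate exp(mu* + z_p sigma*), since the B/C terms cancel.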

If $z_p = 0$ and $B_{r,n}/C_{r,n}$ is positive (r/n is small), then clearly a lower confidence bound at level $\gamma$
for $\exp(\mu)$ is obtained by substituting $F_{1-\gamma}(\nu_1, \nu_2)$ for $F_\gamma(\nu_1, \nu_2) \ge F_{1-\gamma}(\nu_1, \nu_2)$ in (2.6). Using $F_\gamma(\nu_1, \nu_2)$
in (2.6) when $B_{r,n}/C_{r,n}$ is positive will yield an upper confidence bound for $\exp(\mu)$ at level $\gamma$.

If best linear invariant estimators $\tilde{\sigma}_{r,n} = \sigma^*_{r,n}/(1 + C_{r,n})$ and $\tilde{\mu}_{r,n} = \mu^*_{r,n} - B_{r,n}\tilde{\sigma}_{r,n}$ have been used
to estimate $\mu$ and $\sigma$, then one can use

(2.7)   $\exp\{\tilde{\mu}_{r,n} + \{[B_{r,n} + (1 + C_{r,n})z_p]F_\gamma(\nu_1, \nu_2) - [1 - F_\gamma(\nu_1, \nu_2)]B_{r,n}/C_{r,n}\}\tilde{\sigma}_{r,n}\}$

to determine an approximate $100\gamma$ percent lower confidence bound for $t_p$. Here values of $A_{r,n}$, $B_{r,n}$
and $C_{r,n}$ can be determined from tabulated values (see [11, 18]) of $E(L_\mu)$ and $E(L_\sigma)$, the mean
squared errors of $\tilde{\mu}/\sigma$ and $\tilde{\sigma}/\sigma$, respectively, and the expected cross product
$E(CP) = E[(\tilde{\mu} - \mu)(\tilde{\sigma} - \sigma)]/\sigma^2$. To do this, one uses the following relationships:

(2.8)   $C_{r,n} = E(L_\sigma)/[1 - E(L_\sigma)]$

(2.9)   $B_{r,n} = E(CP)/[1 - E(L_\sigma)]$

and

(2.10)   $A_{r,n} = E(L_\mu) + [E(CP)]^2/[1 - E(L_\sigma)]$.

For large sample sizes, simplified efficient unbiased linear estimators of $\mu$ and $\sigma$ (see [1], [4] and
[15]) can be used in (2.6) with corresponding values of $A_{r,n}$, $B_{r,n}$ and $C_{r,n} = l_{r,n}$. Also, as will be
shown in Section 3, maximum-likelihood estimators of $\mu$ and $\sigma$ can be substituted for $\tilde{\mu}_{r,n}$ and $\tilde{\sigma}_{r,n}$
in (2.7). In the latter case, values of $A_{r,n}$, $B_{r,n}$ and $C_{r,n}$ corresponding to linear estimators are still
appropriate, where available.
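Relationships (2.8) through (2.10) amount to a three-line conversion from the tabulated expected losses to the factors A, B and C. The sketch below is illustrative: the function name is hypothetical, and the inputs are made-up numbers chosen so that the outputs are round, not entries from the tables of [11, 18].

```python
def abc_from_expected_losses(e_l_mu, e_l_sigma, e_cp):
    """Recover (A, B, C) from E(L_mu), E(L_sigma) and E(CP) via (2.8)-(2.10)."""
    denom = 1.0 - e_l_sigma
    C = e_l_sigma / denom                 # (2.8)
    B = e_cp / denom                      # (2.9)
    A = e_l_mu + e_cp ** 2 / denom        # (2.10)
    return A, B, C


# Illustrative inputs only.
A, B, C = abc_from_expected_losses(0.492, 0.2, 0.08)
print(round(A, 6), round(B, 6), round(C, 6))
```

With these inputs the conversion gives A = 0.5, B = 0.1 and C = 0.25, which can then be fed into (2.4) and (2.5).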



3. PRECISION OF THE APPROXIMATION

Several examples are given in [18] showing the excellent agreement of Weibull tolerance bounds
based on (2.7) with exact (Monte Carlo) tolerance bounds obtained from tables of [14, 16]. We
now show some further examples from among the cases investigated in the present study, applying
to both Weibull and Gaussian data, and we attempt to draw some guidelines regarding the
precision of the approximation.

3.1. Weibull Tolerance Bounds

In the tables below, values of P are those considered in [13], i.e., P=.01, .05, .10 and $P = 1 - e^{-1}$
(corresponding to $z_p = 0$, which implies the confidence bound is for $\exp(\mu)$, the Weibull scale
parameter, or characteristic life). For each combination of values of P, r and n, the corresponding $\nu_1$ is
displayed. Since $\nu_2$ depends only on n and r, it is exhibited only once. Exact table entries $V_{p,\gamma}$ are
among those found in [14, 16], and for $\gamma \ge .5$ are such that

$P[x_p \ge \tilde{\mu}_{r,n} - V_{p,\gamma}\tilde{\sigma}_{r,n}] = \gamma$.

More generally $V_{p,\gamma}$ is the $100\gamma$th percentile of $V_p = (\tilde{\mu}_{r,n} - x_p)/\tilde{\sigma}_{r,n}$. The approximate values of $V_{p,\gamma}$
are calculated from the coefficient of $\tilde{\sigma}_{r,n}$ in (2.7).

Values of $A_{r,n}$, $B_{r,n}$ and $C_{r,n}$ have been obtained by the linear transformations (2.8), (2.9) and
(2.10) given in Section 2. The transformations are applied to $E(L_\mu)$, $E(L_\sigma)$ and $E(CP)$, which are
values proportional to expected squared errors and cross products for best linear invariant
estimators, tabulated in [11, 12, 18]. The approximation described in [17] has been used in each case for
$F_\gamma(\nu_1, \nu_2)$ since $\nu_1$ and $\nu_2$ are, in general, not integers.

Table 1 gives a representation of 3 out of 15 combinations of sample size n and censoring
number r investigated in the present study. The combinations of n and r for the 15 cases were
n=5(5)20, r=5(5)n, and n=24, r=5(5)20, 24. Approximate values in Table 1 in error by 2 percent
or more are bracketed.

Lawless [8] used the F approximation as described in [18], with tables of $A_{r,n}$, $B_{r,n}$ and $C_{r,n}$
applying to simplified efficient unbiased linear estimators, in conjunction with maximum-likelihood
estimators. In other words, he substituted maximum-likelihood estimators for simplified versions
of best linear invariant estimators in (2.7). He then compared values for obtaining 95 percent
tolerance bounds based on (2.7) with exact values computed for maximum-likelihood estimators
using numerical integration procedures. The cases Lawless considered involved large values of n,
n=25, 40 and 60, and values of r/n ranging from .1 to .9. The values of P he used were .05 and .10.
For all of the 20 cases investigated by Lawless for which $r/n \ge .3$, the difference between the
95 percent tolerance bounds based on the F approximation and the exact tolerance bounds based
on the numerical integration procedures is within about 1 percent. For r/n = .1 or .2 and P=.05
the difference is within 3 percent. For very extreme censoring (r/n = .1) the approximation (2.7)
gives extremely poor results for P=.10, apparently because of the small size of $\nu_1$ for these cases.
Table 1. Approximate (from (2.7)) and Exact $100\gamma$th Percentiles $V_{p,\gamma}$ of $V_p = (\tilde{\mu}_{r,n} - x_p)/\tilde{\sigma}_{r,n}$ for Extreme-Value Data

n=24, r=5, $\nu_2$=8.46

          Approximate Values from (2.7)                  Exact Values from [13]
P         1-e^-1     .10        .05        .01           1-e^-1   .10      .05      .01
nu_1      23.93      3.50       16.28      79.83         23.93    3.50     16.28    79.83
gamma
.02       <-6.704>   <1.691>    <2.140>    3.152         -6.87    0.91     1.90     3.12
.05       <-4.469>   <1.737>    <2.279>    3.441         -4.59    1.36     2.17     3.41
.10       -3.090     <1.799>    <2.433>    3.755         -3.13    1.64     2.38     3.73
.50       <-0.490>   <2.301>    <3.351>    5.575         -0.47    2.45     3.40     5.59
.90       0.585      3.879      5.606      9.943         0.58     3.85     5.64     10.09
.95       0.763      <4.802>    6.808      12.250        0.76     4.50     6.72     12.23
.98       0.929      <6.377>    8.762      15.980        0.92     5.60     8.52     15.75

n=24, r=10, $\nu_2$=20.32

          Approximate Values from (2.7)                  Exact Values from [13]
P         1-e^-1     .10        .05        .01           1-e^-1   .10      .05      .01
nu_1      12.75      38.92      87.92      271.66        12.75    38.92    87.92    271.66
gamma
.02       <-1.720>   <1.552>    2.051      3.126         -1.67    1.47     2.01     3.11
.05       -1.228     <1.673>    2.221      3.408         -1.20    1.63     2.20     3.39
.10       -0.877     1.797      2.393      3.695         -0.87    1.78     2.38     3.69
.50       <-0.060>   2.415      3.242      5.101         -0.08    2.44     3.25     5.10
.90       <0.373>    3.501      4.715      7.532         0.39     3.51     4.73     7.55
.95       <0.454>    3.952      5.324      8.535         0.49     3.94     5.33     8.52
.98       <0.530>    4.575      6.163      9.917         0.60     4.51     6.13     9.94

n=10, r=10, $\nu_2$=27.94

          Approximate Values from (2.7)                  Exact Values from [13]
P         1-e^-1     .10        .05        .01           1-e^-1   .10      .05      .01
nu_1      1.77       123.14     202.22     453.40        1.77     123.14   202.22   453.40
gamma
.02       <-0.320>   <1.257>    1.757      2.864         -0.80    1.21     1.72     2.84
.05       <-0.294>   1.441      1.985      3.195         -0.60    1.42     1.96     3.19
.10       <-0.281>   1.624      2.213      3.526         -0.44    1.62     2.21     3.53
.50       <-0.101>   2.484      3.278      5.071         0.04     2.50     3.29     5.07
.90       <0.679>    3.857      4.974      7.529         0.54     3.86     4.98     7.57
.95       <1.140>    4.390      5.633      8.484         0.71     4.41     5.67     8.57
.98       <1.878>    5.101      6.510      9.753         0.92     5.16     6.65     10.03

For the cases investigated in the present study, which clearly involved smaller values of
sample size n (and linear, rather than maximum-likelihood, estimators), the values of $V_{p,\gamma}$ for
obtaining lower confidence bounds on 1st percentiles tended to be within 1 percent of exact (Monte
Carlo) values except for samples of sizes 5 and 10. For P=.05 and .10 there were more discrepancies
larger than 1 percent. For any combination of $\nu_1$ and $\nu_2$, and $\gamma$ = .75, .90, .95, .98, the exact and the
approximate values of $V_{p,\gamma}$ in both the Lawless study and the present study tend to show excellent
agreement (less than about 1 percent error) in the range

(3.1)   $\nu_2 \ge 5.5, \quad 2\nu_2 < \nu_1 < 18\nu_2 - 50.$

(This is similar in spirit to ranges given by Fertig and Mann [5] and Mann [13] applying to precision
of approximate prediction intervals for Gaussian and Weibull data, respectively.)

If $P = 1 - e^{-1}$ so that $x_p = \mu$, there is little discrepancy between the exact and approximate
values of $V_{p,\gamma}$, $.02 \le \gamma \le .98$, for r/n^A, n^25. For rjn^^A, n^l5, a chi-square approximation
discussed in [15, 18, pp. 245-248] can be used.
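The guideline range for Weibull data, namely nu_2 at least 5.5 with nu_1 between 2 nu_2 and 18 nu_2 - 50, is easy to apply as a screening test before trusting the approximation. The helper below is a sketch with a hypothetical name; treating the bounds on nu_1 as strict inequalities is an assumption about how the range is read. The test values are degrees of freedom listed in Table 1.

```python
def in_good_agreement_range(nu1, nu2):
    """True if (nu1, nu2) falls in the range where error is under about 1 percent."""
    return nu2 >= 5.5 and 2.0 * nu2 < nu1 < 18.0 * nu2 - 50.0


# Degrees of freedom taken from Table 1 (n=24 with r=5 and r=10).
cases = [(79.83, 8.46), (16.28, 8.46), (87.92, 20.32), (38.92, 20.32)]
for nu1, nu2 in cases:
    print(nu1, nu2, in_good_agreement_range(nu1, nu2))
```

The two cases that fail the check correspond to columns of Table 1 that contain bracketed entries.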

3.2. Gaussian Tolerance Bounds 

For evaluation of the F approximation for Gaussian data, tables of Locks, Alexander and
Byars [10] of the noncentral t-distribution were used. In each case considered, of course, r=n, and
values of $A_{r,n}$, $B_{r,n}$ and $C_{r,n}$ corresponding to unbiased versions of maximum-likelihood estimates
were employed. We point out, however, that if, alternatively, we had used values corresponding to
variances and covariances of best linear unbiased estimators of $\mu$ and $\sigma$ (see Sarhan and Greenberg
[21]), $\nu_1$ would not be changed at all and $\nu_2$ would be changed by 1 percent or less.

The noncentral t-variate can be defined by

$(\bar{X} - \mu - z_p\sigma)/(S/\sqrt{n})$

with noncentrality parameter $\delta = -z_p\sqrt{n}$ (and $\mu + z_p\sigma = x_p$) and with degrees of freedom n-1. Here

$\bar{X} = \sum_{i=1}^{n} X_i/n$

is the mean of n normal variates with expectation $\mu$, and

$S = \left[\sum_{i=1}^{n} (X_i - \bar{X})^2/(n-1)\right]^{1/2}$

has expectation $c_n\sigma$, with $c_n = \sqrt{2/(n-1)}\,\Gamma(n/2)/\Gamma[(n-1)/2]$. Then since $\bar{X}$ and $S$ are independent,
one can form an approximate F-statistic as

(3.2)   $(\bar{X} - x_p)/(-z_p S/c_n)$

with

(3.3)   $\nu_1 = 2n z_p^2$

and

(3.4)   $\nu_2 = 2c_n^2/(1 - c_n^2)$.

Then, percentiles of the distribution of the approximate F-statistic (3.2) can be used to approximate
percentiles of the noncentral t-distribution by multiplying the former by $-z_p\sqrt{n}/c_n$. In Table 2,
exact $100\gamma$th percentiles of the noncentral t-distribution from tables of [9] for $\gamma$ = .005, .01, .05,
.10, .25, .50, .75, .90, .95, .99, and $-z_p$ = 1.0(.50)3.0 are tabulated for a specified n-1, with n-1 = 10
and 35. For each value of n-1 a tabulation of corresponding percentiles based on the F approximation
(3.2), (3.3) and (3.4) is exhibited. Approximate values in error by more than two percent
are bracketed.
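The constants in (3.3) and (3.4) can be computed from the gamma function without tables. The sketch below uses hypothetical function names; the F percentile needed to turn these into a noncentral t percentile is left as an input, since it comes from tables or an approximation.

```python
import math


def c_n(n):
    """E(S)/sigma for a normal sample of size n."""
    log_ratio = math.lgamma(n / 2.0) - math.lgamma((n - 1) / 2.0)
    return math.sqrt(2.0 / (n - 1)) * math.exp(log_ratio)


def gaussian_degrees_of_freedom(n, z_p):
    """(nu_1, nu_2) for the approximate F-statistic."""
    c = c_n(n)
    return 2.0 * n * z_p ** 2, 2.0 * c * c / (1.0 - c * c)


def noncentral_t_percentile(F_gamma, n, z_p):
    """Scale an F percentile to an approximate noncentral t percentile."""
    return F_gamma * (-z_p) * math.sqrt(n) / c_n(n)


nu1_10, nu2_10 = gaussian_degrees_of_freedom(11, -1.0)   # n - 1 = 10
nu1_35, nu2_35 = gaussian_degrees_of_freedom(36, -1.0)   # n - 1 = 35
print(nu1_10, round(nu2_10, 1))
print(nu1_35, round(nu2_35, 1))
```

For n - 1 = 10 and -z_p = 1 this gives nu_1 = 22 and nu_2 close to the 39.1 shown in Table 2; for n - 1 = 35 the computed nu_2 is close to, though not identical with, the tabulated 138.9.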

It can be seen from Table 2 that the approximation is excellent for $\gamma$ = .5 or more (the range
of present interest) for $n \ge 9$ (or $\nu_2 \ge 27.1$) when one is concerned with 0.14th through 16th
percentiles of normal and lognormal distributions. The tabulated results, of course, apply also to
84th through 99.86th distribution percentiles when the sign of $z_p$ is changed. The ranges over
which the error in the F approximation is about 2 percent or less are given roughly by

(3.5)   $\gamma = .99$: $\nu_2 \ge 27$, $.4\nu_2 + 10 < \nu_1 < 5.2\nu_2 - 90$

(3.6)   $\gamma = .95$: $\nu_2 \ge 27$, $.4\nu_2 + 10 < \nu_1 < 5.9\nu_2 - 22$

(3.7)   $\gamma = .90$: $\nu_2 \ge 27$, $.4\nu_2 + 10 < \nu_1$

(3.8)   $\gamma = .50, .75$: $\nu_2 \ge 27$, $.7\nu_2 + 15 < \nu_1$



Table 2. — Approximate (from 



.7)) and Exact Percentiles of the Noncentral t-distribution with Noncentrality 
Parameter —ynzp 



Approximate Values from (2.7); n— 1 = 10, i'2=39.1 


Exact Values from [9]; n-l = 10, i^=39.1 


-Zp 


1.00 


1.50 


2.00 


2.50 


3.00 


-Zp 


1. 00 


1.50 


2.00 


2. 50 


3.00| 


A" 


22.0 


49.4 


87.9 


137.5 


198.0 


yU 


22.0 


49.4 


87.9 


137. 5 


198. 


.005 

.01 

.05 

.10 

.25 

..50 

.75 

.90 

.95 

.99 


<1. 193> 

<1. 326> 

<2. 032 > 

<2. 418> 

<2. 582 > 

<3. 247> 

<4. 322 > 

5.435 

6.232 

8.067 


<2. 348> 
<2. 533 > 
<3. 472 > 
<3. 965> 

4. 171 

5. 116 
6.289 
7.600 
8.526 

<10. 628> 


3.482 
3.713 
4.430 
4.872 
5. 723 
6.864 
8.272 
9.836 
10. 936 
<13. 423> 


4. 585 

4.863 

5.718 

6.243 

7.254 

8.605 

10. 267 

12. 110 

<13. 404> 

<16. 327> 


5. 666 

5.990 

6.987 

7.598 

8. 772 

10. 342 

12. 270 

14. 406 

<15. 905> 

<19. 289> 


.005 

.01 

.05 

. 10 

.25 

.50 

.75 

.90 

.95 

.99 


0.725 
0.961 
1.608 
1.966 
2.606 
3.412 
4. 372 
5.434 
6. 193 
7.968 


2. 156 

2. 385 

3. 053 
.3.443 

4. 170 

5. 127 
6.308 
7.647 
8. 618 

10. 916 


3. 42S 
3.674 
4.414 
4.859 
5.705 
6.844 
8.276 
9.918 
11. 119 
13. 979 


4.618 
4.892 
5.729 
6.240 
7.223 
8.562 
10. 261 

12. 224 

13. 665 
17. 109 


5. 76ll 

6.069 

7.017 

7.601 

8. 73C 

10. 278 

12. 256 

14. 550 

16. 238 


Approximate Values from (2.7); $n-1 = 35$, $\nu_2 = 138.9$

$-z_p$     1.00       1.50       2.00       2.50       3.00
$\nu_1$    72.0      162.0      288.0      450.0      648.0
$\gamma$
.005     <3.490>     5.958      8.373     10.745     13.088
.01      <3.684>     6.206      8.674     11.101     13.501
.05       4.602      6.934      9.555     12.142     14.707
.10       5.047      7.356     10.064     12.742     15.402
.25       5.228      8.123     10.984     13.827     16.658
.50       6.011      9.069     12.115     15.157     18.197
.75       6.903     10.131     13.379     16.642     19.914
.90       7.815     11.201     14.649     18.132     21.636
.95       8.413     11.898     15.473     19.098     22.751
.99       9.658     13.334     17.169     21.083     25.043


Exact Values from [9]; $n-1 = 35$, $\nu_2 = 138.9$

$-z_p$     1.00       1.50       2.00       2.50       3.00
$\nu_1$    72.0      162.0      288.0      450.0      648.0
$\gamma$
.005      3.255      5.873      8.368     10.794     13.17?
.01       3.499      6.139      8.671     11.141     13.57?
.05       4.188      6.908      9.554     12.158     14.73?
.10       4.572      7.345     10.062     12.747     15.415
.25       5.244      8.124     10.977     13.812     16.63?
.50       6.048      9.076     12.106     15.136     18.16?
.75       6.928     10.138     13.377     16.634     19.901
.90       7.799     11.206     14.667     18.161     21.67?
.95       8.364     11.906     15.517     19.169     22.84?
.99       9.528     13.362     17.294     21.285     25.31?




For $.50 < \gamma < .90$, the upper limit on $\nu_1 = 2nz_p^2$ corresponds to the extreme upper or lower tail of a
normal or lognormal distribution.

These ranges should be applicable even when samples are censored. Thus, the approximation
(2.6) can be used with best linear unbiased estimates of normal parameters to obtain lognormal
tolerance bounds when $\nu_1$ and $\nu_2$, given by (2.4) and (2.5), fall within one of the various ranges
specified by (3.5) through (3.8). Also, work has begun by this author to generate constants for use
in obtaining simplified linear estimates from large censored normal and lognormal samples. It
should then be possible to use these in conjunction with the approximation (2.6).



REFERENCES 

[1] Bain, L. J., "Inferences Based on Censored Sampling from the Weibull or Extreme-Value
Distribution," Technometrics, 14, 693-702 (1972).

[2] Billman, B. R., C. L. Antle, and L. J. Bain, "Statistical Inference from Censored Weibull
Samples," Technometrics, 14, 831-840 (1972).

[3] Box, G. E. P., "Some Theorems on Quadratic Forms Applied in the Study of Variance Problems,
I. Effect of Inequality of Variance in the One-Way Classification," Ann. Math. Statist.,
25, 290-302 (1954).

[4] Englehardt, M. and L. J. Bain, "Some Complete and Censored Sampling Results for the
Weibull or Extreme-Value Distribution," Technometrics, 15, 541-549 (1973).

[5] Fertig, K. W. and N. R. Mann, "A New Approach to the Determination of Exact and
Approximate One-Sided Prediction Intervals for Normal and Lognormal Distributions, with Tables,"
in Reliability and Fault-Tree Analysis, R. Barlow, Ed., SIAM Series in Applied Mathematics
(1974).

[6] Grubbs, F. E., H. J. Coon, and E. S. Pearson, "On the Use of Patnaik Type Approximations to
the Range in Significance Tests," Biometrika, 53, 248-252 (1966).

[7] Lawless, J. F., "Conditional Versus Unconditional Confidence Intervals for the Parameters of
the Weibull Distribution," J. Amer. Statist. Assoc., 68, 665-669 (1973).

[8] Lawless, J. F., "Construction of Tolerance Bounds for the Extreme-Value and Weibull
Distributions," Technometrics, 17, 255-261 (1975).

[9] Lehmann, E. L., Testing Statistical Hypotheses (John Wiley, New York 1959).

[10] Locks, M. O., M. J. Alexander and B. J. Byars, "New Tables of the Noncentral t
Distribution," Aerospace Research Laboratories Report ARL 63-19, Aerospace Research
Laboratories, Wright-Patterson Air Force Base, Ohio (1963).

[11] Mann, N. R., "Results on Location and Scale Parameter Estimation with Application to the
Extreme-Value Distribution," Aerospace Research Laboratories Report ARL 67-0023,
Office of Aerospace Research, U.S. Air Force, Wright-Patterson Air Force Base, Ohio (1967).

[12] Mann, N. R., "Tables for Obtaining the Best Linear Invariant Estimates of Parameters of the
Weibull Distribution," Technometrics, 9, 629-645 (1967).

[13] Mann, N. R., "Warranty Periods for Production Lots Based on Fatigue-Test Data,"
Engineering Fracture Mechanics, 8, 123-130 (1976).

[14] Mann, N. R. and K. W. Fertig, "Tables for Obtaining Confidence Bounds and Tolerance
Bounds Based on Best Linear Invariant Estimates of Parameters of the Extreme-Value
Distribution," Technometrics, 15, 87-101 (1973).

[15] Mann, N. R. and K. W. Fertig, "Simplified Efficient Point and Interval Estimators for
Weibull Parameters," Technometrics, 17, 361-368 (1975).

[16] Mann, N. R., K. W. Fertig, and E. M. Scheuer, "Confidence and Tolerance Bounds and a
New Goodness-of-Fit Test for Two-Parameter Weibull or Extreme-Value Distributions
with Tables for Censored Samples of Size 3(1)25," Aerospace Research Laboratories Report
ARL 71-0077, Office of Aerospace Research, United States Air Force, Wright-Patterson
Air Force Base, Ohio (1971).

[17] Mann, N. R. and F. E. Grubbs, "Simple, Efficient Closed-Form Approximations for Beta
Percentiles, Exponential Prediction Intervals and Confidence Bounds on Exponential and
Binomial Parameters," J. Amer. Statist. Assoc., 69, 654-661 (1974).

[18] Mann, N. R., R. E. Schafer, and N. D. Singpurwalla, Methods for the Statistical Analysis of
Reliability and Life Data (John Wiley, New York 1974).

[19] Patnaik, P. B., "The Non-Central $\chi^2$ and F Distributions and Their Applications,"
Biometrika, 36, 202-232 (1949).

[20] Pyke, R., "Spacings," J. Royal Statist. Soc. B, 27, 395-449 (1965).

[21] Sarhan, A. E. and B. G. Greenberg, Contributions to Order Statistics (John Wiley 1962).

[22] Thoman, D. R., L. J. Bain, and C. E. Antle, "Inferences on the Parameters of the Weibull
Distribution," Technometrics, 11, 445-460 (1969).

[23] van Montfort, M. A. J., "On Testing that the Distribution of Extremes is of Type I when
Type II is the Alternative," Journal of Hydrology, 11, 421-427 (1970).



A NOTE ON A CONFIDENCE INTERVAL FOR AN INTERCLASS MEAN 



T. Jayachandran* 



Naval Postgraduate School 

Monterey, California 



ABSTRACT 



An exact confidence interval for an interclass mean, that is, the mean of a
composite sample made up of several subsamples of unequal sizes $n_i$, is presented.






1. INTRODUCTION 

Suppose $X_{ij}$, $j = 1, 2, \ldots, n_i$; $i = 1, 2, \ldots, k$ is a composite sample of size
$\sum_{i=1}^{k} n_i$ that is comprised of $k$ subsamples of sizes $n_i$. The $i$th subsample is a random
sample from a normal distribution $N(a_i, \sigma^2)$ and the $a_i$'s, $i = 1, 2, \ldots, k$ are assumed to be
independent and identically distributed (i.i.d.) as $N(\mu, \sigma_a^2)$. $\mu$ is sometimes known as the
interclass mean and the problem considered in this note is the construction of a confidence interval
for $\mu$. Some practical uses of such an interval are discussed in a paper by Long [1]. For example,
the composite sample may be measurements on a characteristic of the output of a factory made on
different days; $\mu$ would then correspond to the true mean value of the characteristic being studied.

The construction of a confidence interval for $\mu$ is straightforward if all the $n_i$ are equal. For
unequal subsample sizes Long [1] obtained an approximate interval that he shows to be reasonably
accurate. The procedure proposed in this paper leads to an exact confidence interval for $\mu$ in the
case of unequal $n_i$.

2. PRELIMINARIES

The observations $X_{ij}$, $j = 1, 2, \ldots, n_i$; $i = 1, 2, \ldots, k$ may be assumed to satisfy the variance
components model

(2.1)    $X_{ij} = a_i + e_{ij}, \quad j = 1, 2, \ldots, n_i; \; i = 1, 2, \ldots, k$



*This research was supported by the Office of Naval Research as part of the Foundation Research Program at
the Naval Postgraduate School.


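where the $a_i$'s are i.i.d. $N(\mu, \sigma_a^2)$, the $e_{ij}$'s are i.i.d. $N(0, \sigma^2)$, and the $a_i$'s and
$e_{ij}$'s are mutually independent. Let

$\bar{X}_i = \frac{1}{n_i} \sum_{j=1}^{n_i} X_{ij}$

and

$\bar{X} = \frac{1}{k} \sum_{i=1}^{k} \bar{X}_i .$

Then $\bar{X}_i$, $i = 1, 2, \ldots, k$ are independent and normally distributed $N(\mu, \sigma_a^2 + \sigma^2/n_i)$.

If $n_i = n$ for all $i$, the $\bar{X}_i$ will constitute a random sample from a normal distribution
$N(\mu, \sigma_a^2 + \sigma^2/n)$.

Thus,

(2.2)    $T = \dfrac{k^{1/2}(\bar{X} - \mu)}{\left[\frac{1}{k-1} \sum_{i=1}^{k} (\bar{X}_i - \bar{X})^2\right]^{1/2}}$

has a Student's $t$ distribution with $(k-1)$ degrees of freedom, and this leads to a confidence interval
for $\mu$.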

For the case where the $n_i$ are not all equal, Long [1] obtained an approximate confidence
interval for $\mu$ by treating $T$ as a Student's $t$ variable. On the basis of a study of the exact distribution
of $T$ for $k = 2, 3$ and an examination of the moments of $T$ for large values of $k$, Long [1] concludes
that the $t$ approximation is fairly reliable. He also points out that the approximation is likely to go
wrong if there are wide variations in the subsample sizes $n_i$ or if $\sigma^2$ is large relative to $\sigma_a^2$.

3. EXACT CONFIDENCE INTERVAL WITH UNEQUAL SUBSAMPLE SIZES 

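For unequal subsample sizes, the $\bar{X}_i$ are not identically distributed, since the variance of
$\bar{X}_i$ is $\sigma_a^2 + \sigma^2/n_i$, even though all of them have the same mean $\mu$. Thus, the construction of a
Student's $t$ variable based on the $\bar{X}_i$ only, independent of the nuisance parameters
$\sigma_a^2 + \sigma^2/n_i$, is not possible. To get around this difficulty let

(3.1)    $Z_i = C_{i1} X_{i1} + C_{i2} \bar{X}_i, \quad i = 1, 2, \ldots, k$

where

$C_{i1} = \left(\dfrac{n_i C - 1}{n_i - 1}\right)^{1/2}, \qquad C_{i2} = 1 - C_{i1}$

and

$C = \dfrac{1}{\min n_i} .$

Then, as shown in the appendix, the $Z_i$ are independent and identically distributed as

$N\left(\mu, \; \sigma_a^2 + \dfrac{\sigma^2}{\min n_i}\right) .$

If

$\bar{Z} = \frac{1}{k} \sum_{i=1}^{k} Z_i$

and

$S_z^2 = \frac{1}{k-1} \sum_{i=1}^{k} (Z_i - \bar{Z})^2$

then

(3.2)    $T = \dfrac{k^{1/2}(\bar{Z} - \mu)}{S_z}$

has a Student's $t$ distribution with $k-1$ degrees of freedom. An exact confidence interval for $\mu$ is
now obtainable in the usual way.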
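
The construction above can be sketched numerically. The following is a minimal Python illustration, not code from the paper: the function name `interclass_mean_ci`, its arguments, and the sample data are hypothetical, and the critical value of Student's t with k-1 degrees of freedom is assumed to be supplied by the caller (for example, from a t table). It forms the Z_i of (3.1) and returns the interval Z-bar plus or minus t times S_z divided by the square root of k, which is what inverting (3.2) gives.

```python
import math
from statistics import mean, stdev


def interclass_mean_ci(subsamples, t_crit):
    """Interval for mu built from Z_i = C_i1 * X_i1 + C_i2 * Xbar_i, as in (3.1).

    subsamples: list of lists, one list of observations per subsample.
    t_crit: upper critical value of Student's t with k-1 degrees of
            freedom, supplied by the caller (assumption of this sketch).
    """
    k = len(subsamples)
    if k < 2:
        raise ValueError("need at least two subsamples")
    n_min = min(len(x) for x in subsamples)
    z = []
    for x in subsamples:
        n_i = len(x)
        if n_i > 1:
            # C_i1^2 = (n_i * C - 1) / (n_i - 1) with C = 1 / min n_i
            c_i1 = math.sqrt((n_i / n_min - 1.0) / (n_i - 1))
        else:
            # a single observation: Z_i is that observation either way
            c_i1 = 0.0
        z.append(c_i1 * x[0] + (1.0 - c_i1) * mean(x))
    z_bar = mean(z)
    half_width = t_crit * stdev(z) / math.sqrt(k)  # stdev uses divisor k-1
    return z_bar - half_width, z_bar + half_width


# Hypothetical data with equal subsample sizes, so Z_i equals Xbar_i.
data = [[1.0, 3.0], [2.0, 4.0], [6.0, 8.0]]
low, high = interclass_mean_ci(data, t_crit=4.303)
```

With equal subsample sizes every C_i1 is zero, so the sketch falls back to the interval based on the subsample means, in line with the remark that (3.2) then reduces to (2.2).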

4. DISCUSSION 

The exact procedure proposed in Section 3 is somewhat ad hoc in nature in the sense that the
first observation $X_{i1}$ in each subsample plays a prominent role. To avoid any systematic bias
that may creep in, it would be preferable to randomly permute the observations in each of the
$k$ subsamples before applying the procedure. If all the subsample sizes $n_i$ are equal to $n$, the statistic
(3.2) reduces to the usual $T$ statistic (2.2).



APPENDIX

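Under the assumptions of the variance components model (2.1),

$E(X_{ij}) = \mu, \quad j = 1, 2, \ldots, n_i; \; i = 1, 2, \ldots, k$

$\mathrm{Var}(X_{ij}) = \sigma_a^2 + \sigma^2$

$\mathrm{Cov}(X_{ij}, X_{i'j'}) = \begin{cases} \sigma_a^2, & i = i', \; j \neq j' \\ 0, & i \neq i' \end{cases}$

$E(\bar{X}_i) = \mu$

and

$\mathrm{Var}(\bar{X}_i) = \sigma_a^2 + \dfrac{\sigma^2}{n_i} .$

It follows that

$\mathrm{Cov}(X_{ij}, \bar{X}_i) = \dfrac{1}{n_i}\left[\mathrm{Var}(X_{ij}) + (n_i - 1)\sigma_a^2\right]$

and hence

$\mathrm{Cov}(X_{ij}, \bar{X}_i) = \sigma_a^2 + \dfrac{\sigma^2}{n_i}, \quad j = 1, 2, \ldots, n_i .$

Therefore, if

$Z_i = C_{i1} X_{i1} + C_{i2} \bar{X}_i$

then

$E(Z_i) = (C_{i1} + C_{i2})\mu$

and

$V(Z_i) = C_{i1}^2 V(X_{i1}) + C_{i2}^2 V(\bar{X}_i) + 2 C_{i1} C_{i2} \, \mathrm{Cov}(X_{i1}, \bar{X}_i) .$

Substitution of

$C_{i1} = \left(\dfrac{n_i C - 1}{n_i - 1}\right)^{1/2}, \qquad C_{i2} = 1 - C_{i1} \qquad \text{and} \qquad C = \dfrac{1}{\min n_i}$

results in

$E(Z_i) = \mu \quad \text{and} \quad V(Z_i) = \sigma_a^2 + \dfrac{\sigma^2}{\min n_i} .$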
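
The variance identity in this appendix can be checked with exact arithmetic. The sketch below is an illustration under stated assumptions, not part of the original note: the helper name `sigma2_coefficient` and the list of subsample sizes are hypothetical. Since the weights sum to one, the coefficient of the between-class variance in V(Z_i) is 1, and the coefficient of the within-class variance simplifies to C_i1^2 + (1 - C_i1^2)/n_i, which should equal 1/min n_i for every subsample.

```python
from fractions import Fraction


def sigma2_coefficient(n_i, n_min):
    """Coefficient of sigma^2 in V(Z_i), using C_i1^2 = (n_i*C - 1)/(n_i - 1)."""
    c = Fraction(1, n_min)  # C = 1 / min n_i
    c_i1_sq = (n_i * c - 1) / (n_i - 1) if n_i > 1 else Fraction(0)
    # C_i1^2 * 1 + (C_i2^2 + 2*C_i1*C_i2) / n_i, where C_i2 = 1 - C_i1,
    # so C_i2^2 + 2*C_i1*C_i2 = 1 - C_i1^2.
    return c_i1_sq + (1 - c_i1_sq) / n_i


sizes = [2, 3, 5, 8]  # hypothetical unequal subsample sizes
coeffs = [sigma2_coefficient(n, min(sizes)) for n in sizes]
```

Every entry of `coeffs` comes out as the same fraction, which is the sense in which the Z_i share a common variance even though the subsample means do not.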



REFERENCES 

[1] Long, W. M., "Estimation Problems When a Simple Type of Heterogeneity is Present in the
Sample," Biometrika, 38, 90-101 (1951).



U.S. GOVERNMENT PRINTING OFFICE: 1977-240-830/11-3



INFORMATION FOR CONTRIBUTORS 

The NAVAL RESEARCH LOGISTICS QUARTERLY is devoted to the dissemination of 
scientific information in logistics and will publish research and expository papers, including those 
in certain areas of mathematics, statistics, and economics, relevant to the over-all effort to improve 
the efficiency and effectiveness of logistics operations. 

Manuscripts and other items for publication should be sent to The Managing Editor, NAVAL
RESEARCH LOGISTICS QUARTERLY, Office of Naval Research, Arlington, Va. 22217.
Each manuscript which is considered to be suitable material for the QUARTERLY is sent to one
or more referees.

Manuscripts submitted for publication should be typewritten, double-spaced, and the author 
should retain a copy. Refereeing may be expedited if an extra copy of the manuscript is submitted 
with the original. 

A short abstract (not over 400 words) should accompany each manuscript. This will appear 
at the head of the published paper in the QUARTERLY. 

There is no authorization for compensation to authors for papers which have been accepted 
for publication. Authors will receive 250 reprints of their published papers. 

Readers are invited to submit to the Managing Editor items of general interest in the field 
of logistics, for possible publication in the NEWS AND MEMORANDA or NOTES sections 
of the QUARTERLY. 



NAVAL RESEARCH 

LOGISTICS 

QUARTERLY 



MARCH 1977 
VOL. 24, NO. 1 

NAVSO P-1278 






CONTENTS 



ARTICLES 



A Two-Echelon Inventory Model with Purchases, Dispositions, Shipments, Returns and Transshipments
    B. HOADLEY, D. P. HEYMAN    1

Optimal Reject Allowance with Constant Marginal Production Efficiency
    A. BEJA    21

A Chance-Constrained Distribution Problem
    R. M. REESE, A. C. STEDRY    35

Elements of a Theory in Non-Convex Programming
    C. BURDET    47

Convex and Polaroid Extensions
    C. BURDET    67

A Cutting Plane Algorithm for the Bilinear Programming Problem
    H. VAISH    83

The Effect of Correlated Exponential Service Times on Single Server Tandem Queues
    C. R. MITCHELL, A. S. PAULSON, C. A. BESWICK    95

Single-Lane Bridge Serving Two-Lane Traffic
    Z. ESHCOLI, I. ADIRI    113

Optimal Control for Multi-Servers Queueing Systems under Periodic Review
    C. C. HUANG, S. L. BRUMELLE, K. SAWAKI, I. VERTINSKY    127

Cyclical Job Sequencing on Multiple Sets of Identical Machines
    H. I. STERN, E. P. RODRIGUEZ, M. L. UTTER    137

Johnson's Approximate Method for the 3 X n Job Shop Problem
    W. SZWARC, G. K. HUTCHINSON    153

A Convex Property of an Ordered Flow Shop Sequencing Problem
    S. S. PANWALKAR, A. W. KHAN    159

A Manpower Planning/Capital Budgeting Model (MAPCAB)
    R. H. CLARK, R. A. COMERFORD    163

An F Approximation for Two-Parameter Weibull and Lognormal Tolerance Bounds Based on Possibly Censored Data
    N. R. MANN    187

A Note on a Confidence Interval for an Interclass Mean
    T. JAYACHANDRAN    197



OFFICE OF NAVAL RESEARCH 
Arlington, Va. 22217